Crowdsourcing the linguistic landscape of a multilingual country: Introducing Lingscape in Luxembourg

This paper introduces the citizen science mobile application Lingscape. This free research tool for Android and iOS smartphones uses a crowdsourcing approach for research on linguistic landscapes. The paper discusses the use of mobile applications and crowdsourcing in linguistics, the methodological requirements and problems of an app-based approach to the study of linguistic landscapes, and the key features of the app Lingscape. It considers Luxembourg's cultural super-diversity as well as existing studies of the Luxembourgish linguistic landscape to set the background for the pilot study.


1 Introduction: Linguistic landscapes in sociolinguistics
For the longest time, the semiotics of signage in urban communities or multilingual countries has not been part of the sociolinguistic analysis of cultural practice. Only in the late 1990s (at least under the label linguistic landscapes) did the first studies appear to shed light on the different ways in which lettering practice in public space shapes the linguistic landscape of a given community: "The language of public road signs, advertising billboards, street names, place names, commercial shop signs, and public signs on government buildings combines to form the linguistic landscape of a given territory, region, or urban agglomeration." (Landry/Bourhis 1997: 25) Following this influential definition, there has been substantial effort in the field over the last 20 years concerning the study of linguistic landscapes in multilingual (urban) communities.1 Gorter (2006) gives an overview of different methodological approaches assembled under the umbrella term linguistic landscapes, ranging from the analysis of sociolinguistic constellations in multilingual communities (cf. Sciriha/Vassallo 2001) to studies that focus on communication patterns between patients and therapists (cf. Fleitas 2003) or even olfactory impressions of a given landscape (cf. Pennycook/Otsuji 2015). But for the most part, researchers have adopted the definition by Landry/Bourhis (1997), so that nowadays most research on linguistic landscapes still focuses on the complexity and dynamics of lettering practice in public space (cf. Shohamy et al. 2010, Backhaus 2007, Scollon/Scollon 2003). While some researchers document the presence of specific varieties like minority languages (cf. Gorter et al. 2012) or dialects (cf. Long/Nakai 2014, Reershemius 2011, Ziegler et al. 2017) in linguistic landscapes, others analyze specific places and functional buildings, such as train stations and airports (cf. Domke 2014), or typographical aspects of signs (cf. Wachendorff 2015). All this parallels the development of an analytical framework for typological (cf. Backhaus 2007), semiotic (cf. Scollon/Scollon 2003), functional (cf. Auer 2010), and sociocultural (cf. Blommaert 2013) differences of signs and lettering in public space. Blommaert/Maly (2014) argue for a more ethnographically informed analysis of changing social practices, and Blommaert (2016) criticizes the "traditional" approaches to linguistic landscapes research for largely ignoring social dynamics and the fast-growing importance of virtual spaces for the structuration of social practice.
Despite all these efforts, data collection and analysis for research on linguistic landscapes remain arduous and problematic, due to the density of signage in public space as well as the co-presence of many different languages and scripts on signs, which need to be translated before the analysis. And although there has been some effort to develop new and standardized methodologies for the field (cf. Barni/Bagna 2015, Shohamy/Gorter 2009), some methodological and theoretical issues have not yet been addressed in detail. The main purpose of this paper is therefore to discuss the methodological implications of an app-based approach to the study of social semiotics. The main reason to do so is the recent emergence of a crowdsourcing approach in linguistic landscapes research through mobile applications.
The following aspects need to be considered regarding the basic methodological conditions of an app-based approach to the study of signs and lettering in public space. All of them also apply to research on linguistic landscapes in general. Here, they will be discussed against the background of a crowdsourcing approach.
1. Inherent dynamics of the linguistic landscape regarding lettering practices: official (e.g., new roads and places that have to be named), commercial (e.g., new stores replace old ones), private (e.g., residential announcements), or subcultural (e.g., graffiti that may be wiped off);
2. Emplacement of the collected data as for different authors, audiences, purposes (e.g., information vs. order), or intentions (e.g., public announcement vs. advertising);2
3. Lack of representativity of the collected data as related to the totality of signage in a community as well as the user population;
4. Granularity of the collected data regarding "para-lettering" (e.g., manufacturer name on the screw fixing a signpost) as well as "meta-lettering" (e.g., street art sticker on a street sign) or "response lettering" (e.g., comments on a private announcement) that may lead to complex semiotic layering on signs and surfaces;
5. Accessibility of the collected data with respect to the copresence of different scripts (e.g., restaurant menus in multicultural neighborhoods) and languages (e.g., signposts in touristic areas);
6. Categorization of the collected data for analysis;
7. Legal obligations due to copyrights, trademark rights, and property rights for signs and lettering;
8. Data presentation and analysis in view of thousands of photos taken during data collection.

ISSN 1615-3014
Some of these problems are of a more general kind and concern all research on language (dynamics, contextualization, representativity), some concern methodological decisions regarding the research design (granularity, accessibility, categorization), and some are specific problems of the photo-based approach (legal obligations, presentation and analysis). Still, all of them need to be addressed in order to justify the way in which linguistic landscapes are surveyed (see Section 5). Purschke (forthcoming) further discusses the methodological challenges and scientific potential of a citizen-science approach to the study of social semiotics.

Mobile apps in linguistics
One relatively new and innovative way to tackle these problems is the use of a crowdsourcing approach (cf. Eskenazi 2013, Howe 2006) to study linguistic phenomena by means of smartphone technology (cf. Wang et al. 2016). Compared to traditional data collection approaches, which in many cases rely on groups of Western students surveyed during class (cf. Henrich et al. 2010, Gosling et al. 2010), crowdsourcing approaches have been shown to allow reliable data collection from a diverse pool of participants (cf. Behrend et al. 2011). While linguistics did not use this approach for a long time, several recent projects employ mobile devices in data collection. Especially in dialectology, there are currently several apps available that follow a crowdsourcing approach to collect audio data, document language change, perform speaker localization quizzes, or collect evaluations of speakers' accents (cf. Van Leeuwen/Orr 2016, Hove et al. 2015, Kolly/Leemann 2015). This way of collecting linguistic data offers completely new possibilities regarding the sheer amount of data, but also implies some methodological limitations, such as multiple submissions by and anonymity of the respondents, technology-induced problems concerning connectivity and data quality, as well as interferences caused by the research design and input interface (cf. Leemann et al. 2016). However, the first analyses of data gathered with such apps demonstrate the potential of this approach for the study of language variation and change (cf. Leemann et al. 2015a, 2015b). Beyond that, these app-based approaches not only offer the participants a way to directly engage in the research process, but at the same time make use of gamification elements (rating of uploaded accent recordings) or visualizations of different aspects of the participants' speech behavior (accent origin, speech rate, average pitch) as a reward for the users' submissions. In doing so, mobile crowdsourcing opens up new perspectives for public engagement in scientific research as well as for science communication.
Regarding the study of linguistic landscapes, there have not been comparable approaches up to now, even though the technical specifications of current smartphones provide ideal preconditions for such an app-based approach. These include high-resolution cameras and GPS modules, as well as storage capacity and high-speed mobile networks. Given the requirements for research on multilingual lettering practices (high-quality photos including geolocation and metadata), a crowdsourcing approach to both data collection and processing may substantially support this kind of research by allowing large numbers of geolocated photos to be collected throughout entire communities (and even countries). Compared to the classical approach to linguistic landscapes research (walking around and taking photos of individual signs and lettering along the way), the main advantage of a crowdsourcing approach is that a survey by mobile app makes it possible to involve large groups of users in the data collection process, which may result in large collections of signs and lettering that allow quantitative analyses. The biggest disadvantage of the approach, though, may be that with this kind of survey, many aspects of data quantity and quality (including metadata) lie in the responsibility of the users. But although this may seem like a risk to data collection, it may in fact turn out to be one of the biggest strengths of projects like Lingscape (see Section 3), since the data we collect does not represent our expert view on the landscape but combines thousands of individual perspectives on visible multilingualism in public space. This basically means that our data collection reflects not only our particular research interest and perspective on the linguistic landscape, but also the many different ways in which our users perceive the linguistic landscape in everyday life.
Although app-based approaches to the study of linguistic landscapes have been missing up to now, Carla Bagna and Monica Barni made use of similar techniques very early on to collect data in Italy (cf. Barni 2008, Bagna/Barni 2006). In order to link the collected photos directly to specific geolocations on the map, data collection was conducted by two researchers, one equipped with a digital camera, the other with a handheld computer running GIS software (cf. Barni/Bagna 2009: 132). As a result, not only could a geolocation be set for every single photo; the software (MapGeoLing) also allowed an initial classification of the collected texts. However, due to technical innovations in both camera and communication technology (even most digital cameras nowadays include GPS modules), present-day smartphones offer completely new possibilities for app-based data collection, especially regarding the potential of crowdsourcing approaches.
At this point in time, there are two projects that use smartphone applications for research on linguistic landscapes: one is part of the "Multilingual Manchester" project (LinguaSnapp; cf. Gaiser/Matras 2016),3 the other one, Lingscape, was developed by the University of Luxembourg (cf. Purschke 2016) in cooperation with the Swiss software studio ibros.ch.4 Since both apps base their data collection on similar technology,5 mainly Lingscape will be discussed in the following, although both projects differ regarding the methodological decisions taken for implementation (see Purschke forthcoming for a discussion). One major difference between the two projects lies in the fact that Lingscape collects public lettering irrespective of the languages visible on the signs (based on the users' interests), while LinguaSnapp only documents signs that contain languages other than English. The main reason for this difference may relate to the different sociolinguistic settings in Luxembourg and Manchester (multilingual country vs. monolingual city, both with a high percentage of foreign residents).

3 The app Lingscape
The mobile app Lingscape is a free research and teaching tool for iOS and Android designed for the survey of polymorphic linguistic landscapes around the world.6 The main goal of the app is to bring together researchers and citizens in a joint research process: Users are not only part of the data collection, but can also actively contribute to project development and learn more about the cultural and semiotic complexity of the society they live in. During the initial project phase (starting from September 2016), data collection focused on Luxembourg and the surrounding countries. In January 2017, we published an update that introduces an administration backend for external projects conducted with Lingscape as well as an interactive online analysis tool.7 Right now, there are several people and projects in Europe working with the app.
Beyond that, a pilot study that introduces Lingscape as a digital teaching tool is currently under preparation.8 The second project phase (starting from 2018) will introduce project-specific annotation tools as well as the transition of the software into an open source project. The app is currently available in four languages (English, French, German, and Luxembourgish), reflecting Luxembourg's multilingual constitution. It features two core functionalities, which will be presented in more detail here: (1) Map viewer and (2) Photo upload. Access to the app and photo upload are unrestricted, which means that users can anonymously add photos and browse the map viewer. No user data is collected except for the telephone ID (due to legal obligations). Before entering the app, every user needs to complete an introductory tutorial (Figure 1, left panel) and agree to the terms and conditions of the app.9

Map viewer
After finishing the tutorial, the user enters the app home screen, which consists of a map viewer showing all uploaded photos (Figure 1, middle panel). The app uses an implementation of the Google Maps API. At the beginning, the map is centered on the user's current location (if location services are activated; if not, the map initially shows Luxembourg). Users can choose between satellite and map view via the divided button on top of the map. All uploaded photos are directly accessible through the map viewer: Tapping on a red pin opens a small preview for the chosen entry; a second tap then leads to the detail view for the respective photo (Figure 1, right panel), containing information about the location of the sign, the visible languages, and user-generated comments. Beyond that, users can share the photo via social media or report inappropriate photos, insulting comments, or other types of misuse.

Photo upload
The upload process can be started by tapping the button Add a photo on the home screen (Figure 1, middle panel). Users can choose between the camera or their local media library as a source. After taking a photo with the camera or choosing an existing one from the library, the photo can be cropped so that only the sign or lettering is depicted (Figure 2, left panel). The app uses a maximum image size of approx. 1.5 MB (compression during upload with ca. 80% image quality and up to 3000 × 3000 pixels) as a compromise between image quality and file size (= upload time). Afterwards, the exact location of the photo can be defined directly on the map (Figure 2, middle panel). If the photo already contains GPS coordinates, the location will be set automatically on the map. If the photo does not have GPS coordinates yet or the displayed location is not exact, the user can manually correct the location by tapping on the map. The possibility to correct the location (or to define it in the case of photos taken without an active GPS module) is an important feature of the app due to the often limited precision of GPS locating via smartphones. It also means that users do not need to have the GPS module activated while taking photos.
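The two upload rules described above (a pixel limit per side and manual correction of the location) can be illustrated with a minimal Python sketch. It is not the app's actual implementation: the function names `fit_within` and `resolve_location` are hypothetical, and the sketch assumes that the 3000 × 3000 limit is enforced by proportional downscaling and that a location set by tapping on the map takes precedence over GPS coordinates stored in the photo.

```python
from typing import Optional, Tuple

MAX_SIDE = 3000  # pixel limit per side, as stated in the text

Coordinates = Tuple[float, float]  # (latitude, longitude)


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> Tuple[int, int]:
    """Scale dimensions down proportionally so that neither side exceeds max_side.

    Assumption: images are never enlarged, only reduced.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


def resolve_location(
    photo_gps: Optional[Coordinates],
    manual_tap: Optional[Coordinates],
) -> Optional[Coordinates]:
    """Pick the location for an upload.

    Assumption: a manual tap on the map overrides the photo's GPS
    coordinates; None means the user still has to set a location.
    """
    if manual_tap is not None:
        return manual_tap
    return photo_gps


if __name__ == "__main__":
    print(fit_within(6000, 4000))
    print(resolve_location((49.6116, 6.1319), None))
```

Keeping the location logic separate from the image handling mirrors the text: a photo taken without an active GPS module is still valid input, it simply requires the manual step.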
In a third step, the users can define the visible languages on the photo and also enter a comment with additional information like context details, visible scripts, varieties of languages (Figure 2, right panel), or even their name.10 By tapping the button Submit photo, the photo is then transferred to the server and instantly added to the map, so that the user gets a direct confirmation of his/her contribution (including a pop-up window with the user's personal contribution count). Uploaded content is moderated afterwards by the Lingscape team to make sure that inappropriate material is deleted from the server as fast as possible. This approach to data collection carries a certain risk of misuse by the users, but it does not leave the user without any direct feedback on his/her prior action, as would be the case with an "upload and approval" method (see Section 5.5).

[...] 16.4% of all residents), French (7.5%), Italian (3.6%), Belgian (3.4%), or German (2.2%) backgrounds. In total, Luxembourg is home to people of around 170 different nationalities. One main factor for the present cultural super-diversity is the large amount of socioeconomic migration and (trans-border) commuting due to the relatively strong position of the Luxembourgish economy as well as the presence of official EU institutions in Luxembourg City.12 The second main factor relates to historical reasons: For the longest time, Luxembourg was in close contact with (or even part of) the respective neighboring empires, before it was declared fully independent in 1890.
Today, Luxembourg is officially trilingual, with Luxembourgish as the national language (and mother tongue of born Luxembourgers) and French and German as official languages.13 Among these, Luxembourgish is currently undergoing extensive linguistic and socio-cultural processes of "Ausbau" (cf. Kloss 1967), including standardization (i.e., dialect leveling; cf. Gilles 2006), codification (cf. Gilles 2015), and spread to new domains (e.g., emergence of a private literacy in digital media; cf. Belling & de Bres 2014).14 In everyday life, many Luxembourgers use at least three, if not four or five languages on a regular basis (cf. Fehlen & Heinz 2016, Gilles et al. 2010).15 Due to the sociocultural complexity, language practice in both public and private space is in fact polylingual, including (at least) English, Portuguese, and Italian as contact languages, with specific linguistic constellations for different domains of language use. For example, French (private sector) and Luxembourgish (public sector) are the dominant languages in work life, while German serves as the literacy language in schools. Media are mostly in German and French (and sometimes Luxembourgish). Plenary sessions in the house of representatives require Luxembourgish, but laws are written in French. And of course, the importance of English as a bridging language is also increasing in Luxembourg.
Starting from this complex situation, multilingual practice in Luxembourg may serve as a paradigmatic case for the study of linguistic landscapes, particularly regarding the question of how public lettering practice directly impacts the complex conditions of Luxembourgish multilingualism. This holds especially for the interplay of different structuring factors like language policy, language use, and the social hierarchizing of languages, all of which play an important role for the presence and presentation of specific languages on signs and lettering. To this day, only very few studies have been carried out documenting the polymorphic linguistic landscape in Luxembourg. Basically, there is one overview article by Gilles et al. (2010) and two master's theses examining different aspects of lettering practice in public space (cf. Heissler 2008, Garand 2011).16 All three studies will be briefly discussed in the following, also to make their results available to the linguistic landscapes community.
In a holistic approach, Gilles et al. (2010) conducted a study of mono- and multilingual signs in Luxembourg with a corpus of around 600 photos, including official lettering (top-down) as well as commercial or private signs and posters (bottom-up).17 For the monolingual signs, they found a strong predominance of French both in top-down (institutional) and in bottom-up (commercial and private) signs. For the multilingual signs (two or three languages), they found a strong preference for specific language constellations in top-down communication (e.g., French-German and French-German-English). Compared to the top-down signs, there was a lot more multilingual heterogeneity in bottom-up signs. Luxembourgish in general did not play a big role in public lettering, except for place name signs, where it was often used as the second language after French. However, the role of Luxembourgish in public lettering may have changed over the last 10 years due to the public enhancement and political fostering of Luxembourgish as a major means for the formation of a national identity.
Heissler (2008) took the fact that Luxembourgish does not play a big role in public lettering (except for informative commercial communication) as a starting point for her analysis of multilingual discourses in a Luxembourgish supermarket. The study concentrated on the supermarket as a defined semiotic space with a specific communicative function. Therefore, all visible signs that were at least partially linguistic were taken into consideration. Still, all movable items, like brochures or shopping carts, were left aside. Also, products, price lettering, and signs of external manufacturers were only included as examples. In order to structure the collected data and to define the semiotic space "supermarket", seven discourses were defined: orientation, organization, service, advertising, products, price lettering, and security (cf. Heissler 2008: 49). The corpus for the analysis consisted of roughly 800 photos including context and detail views. For the analysis, the collected signs were classified with regard to their provenience. The analysis revealed different linguistic patterns for the seven defined discourses: While monolingual French signs dominated in the discourses orientation, organization, and price lettering, Luxembourgish was the most frequent language in advertising. For the other discourses, Heissler (2008) found more diverse patterns of language use (including homophonic French-German signs with product explanations in the service discourse) that in many cases also depended on the respective product or manufacturer, leading to polyphonic signs in some cases. But apart from this, most explicitly multilingual signs were of a homophonic kind (offering explanatory translations) with a code preference for French over German. This predominance of French can easily be explained by the fact that up to 80% of all employees in the private sector have a foreign (mostly French, Belgian, or Portuguese) background, many of whom are also daily commuters, so that even for many Luxembourgers French is in fact the most likely language for many activities in everyday life. Luxembourgish lettering in the supermarket mainly served as a means of symbolic integration for the Luxembourgish customers, whereas German was only present on bilingual signs (next to French). The alternation of German and Luxembourgish lettering practice in this case may also reflect the situation that Luxembourgish literacy is still not fully fledged, so that German often still serves as "written Luxembourgish".
Garand (2011) surveyed the lettering of street signs in 58 Luxembourgish municipalities with respect to the (co)presence and hierarchizing of the three official languages. Beyond that, diachronic and motivational aspects of street sign lettering practice were surveyed as well, via a questionnaire handed out to persons currently or formerly responsible for the lettering. The corpus comprised 1054 photos of street signs documenting all official street signs in the respective municipalities. Since municipalities in Luxembourg are free to choose the languages displayed on street signs, "those in authority [may] use language in the public space to deliver symbolic messages about the importance, power, significance and relevance of certain languages or the irrelevance of others." (Shohamy 2006: 110) The study revealed that there are more monolingual (731) than multilingual (323) street signs in Luxembourg, with a slight predominance of French (56%) over Luxembourgish (44%) on the monolingual signs.18 German did not play a role in street sign lettering except for rare exceptions. On the multilingual signs, French was the dominant language by design and typography. Also, most multilingual signs (69%) were homophonic. While the strong position of French in public lettering can easily be explained by the historical predominance of French in administration, the study also revealed a growing number of Luxembourgish street signs, both in monolingual and multilingual settings. The main motive of the persons responsible for this fostering of Luxembourgish on public street signs (again) directly addresses the role of Luxembourgish as a means for the formation of a Luxembourgish identity.
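As a quick consistency check on the figures reported for Garand (2011), the following Python sketch recomputes the corpus size and the proportions of monolingual and multilingual signs from the two counts given above. Only the counts 731, 323, and 1054 come from the text; the helper name `share` and the rounding to one decimal place are choices made for this illustration.

```python
# Counts as reported in the text for Garand (2011)
MONOLINGUAL = 731
MULTILINGUAL = 323
REPORTED_CORPUS_SIZE = 1054


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total = MONOLINGUAL + MULTILINGUAL
# The two categories add up exactly to the reported corpus size.
assert total == REPORTED_CORPUS_SIZE

if __name__ == "__main__":
    print(f"monolingual:  {share(MONOLINGUAL, total)}%")   # 69.4%
    print(f"multilingual: {share(MULTILINGUAL, total)}%")  # 30.6%
```

In other words, roughly seven out of ten documented street signs display a single language.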
Taken together, the situation in Luxembourg offers all prerequisites for an interesting case study of multilingual lettering practices in public space, especially considering the complex copresence of French, German, and Luxembourgish in various discourses. Beyond that, the existing research up to now only highlights some of the many potentially interesting discourses to be surveyed with respect to the polymorphic linguistic landscape in Luxembourg. Using the Lingscape application makes it possible for us, with the help of our users, to broaden and deepen research on the linguistic landscape in Luxembourg in two regards, both of which offer the potential for new insights into visual multilingualism:
1. Variability of the landscape (quantitative approach): We will be able to document and analyze a multitude of signs throughout the entire country (and the surrounding areas) by means of the crowdsourcing approach to data collection and automated quantitative analyses of sign contents (e.g., regarding language hierarchization on signs as well as patterns of regional language dominance in signage).
2. Perceptions of the landscape (qualitative approach): We will be able to analyze the participants' perceptions of visible multilingualism in public space by collecting participant statements about visible languages on the uploaded signs that will allow an analysis of how the participants perceive and categorize the copresence of different languages in public signage.19
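To make the quantitative strand more concrete, the following Python sketch shows one way in which patterns of regional language dominance could be derived from uploaded records. It is an illustration under stated assumptions, not the project's analysis pipeline: the record layout (a `region` label and a list of `languages` per photo), the function name `dominant_language_by_region`, and the sample data are all invented for this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, TypedDict


class SignRecord(TypedDict):
    """Hypothetical record layout for one uploaded photo."""
    region: str
    languages: List[str]


def dominant_language_by_region(records: Iterable[SignRecord]) -> Dict[str, str]:
    """Count how often each language is tagged per region and return the
    most frequently tagged language for every region.

    Ties are resolved in favor of the language that was counted first.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for record in records:
        for language in record["languages"]:
            counts[record["region"]][language] += 1
    return {
        region: counter.most_common(1)[0][0]
        for region, counter in counts.items()
    }


if __name__ == "__main__":
    # Invented sample data, for illustration only
    sample: List[SignRecord] = [
        {"region": "region_a", "languages": ["French", "German"]},
        {"region": "region_a", "languages": ["French"]},
        {"region": "region_a", "languages": ["Luxembourgish"]},
        {"region": "region_b", "languages": ["Luxembourgish"]},
        {"region": "region_b", "languages": ["Luxembourgish", "French"]},
    ]
    print(dominant_language_by_region(sample))
```

Since the language tags are assigned by the users, such counts describe the landscape as documented and categorized by the participants, which ties the quantitative strand to the perception-oriented one.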

Challenges and opportunities of the approach
We can now discuss in more detail some ways in which Lingscape deals with the methodological aspects (as defined in the introductory section) that need to be addressed for a crowdsourcing approach to research on linguistic landscapes. In some cases, specific methodological considerations concerning app-based data collection and analysis need to be taken into account; other issues concern research on linguistic landscapes in general. The theoretical background for this discussion revolves around the question of how we can make use of smartphone technology and a crowdsourcing approach to motivate users to participate in data collection for research on linguistic landscapes, especially regarding different aspects of app design and research methodology (see Purschke forthcoming for a discussion that also introduces a methodological framework for community-based participatory research).20 The overall aim of the project is to engage with citizens (and students) in a joint research process that a) helps us to collect data for participatory research and b) fosters awareness of (visible) multilingualism as a constitutive element of everyday social practice in most modern societies.

18 One methodological problem with the study lies in the fact that the many French loan words in Luxembourgish and its structural proximity to German sometimes make it hard to decide whether a displayed name should be judged as a representation of Luxembourgish, French, German, or an amalgamation of these languages. Beyond that, many geographical and personal names appear on these signs, for many of which there are no official translations.
19 See for example the word "danger" in the detail view (Figure 1, right panel), which can represent English or French. Thus, the user's choice of language can reveal his/her perception of the linguistic landscape.
There is in fact a potential benefit for the participants who use the app, which might motivate some of them to contribute photos. However, we cannot assume a high level of intrinsic motivation for the task we are asking of the participants, since the main purpose of the app is research-oriented. Also, users in general prefer interactions serving their personal purposes, rather than contributing their own knowledge. So in order to offset this bias for data collection (cf. Hartson/Pyla 2012), the main guideline for app development and research methodology was to create an application that conforms to the following requirements:
• Optimized user experience: The handling of the app should be as intuitive and easy as possible to make sure that the user experience is inviting and smooth. This concerns the graphical user interface as well as a streamlining of the interaction processes for photo upload and map viewer and the possibility to interact with the app without a personalized login. We assume that the more accessible and welcoming the app is, the more likely it will be for a participant to interact with the app.
• User-centered data collection: The upload function should contain direct feedback for the user as a confirmation of his/her contribution. Furthermore, the upload process should be designed in a way that does not leave participants with the impression that they simply fulfil a strict, predefined research interest that only allows "approved perceptions" of the landscape. Therefore, the user can decide what aspect of the landscape he/she wants to document. Furthermore, only a few analytic descriptors are added to the photo during upload (location, languages, comment). We assume that the more open and satisfying the interaction with the app is, the more likely it will be for a participant to make contributions.
• Fostering of user engagement: The app should be built in a way that ensures a frictionless user experience and increases the likelihood of re-use. This includes native app development for all platforms including regular updates as well as a close integration with platform-specific software requirements and interaction possibilities. Also, we will add new features to the app like gamification elements or a gallery showing recent uploads. Beyond that, we are building a community around the project using social media platforms and special events, e.g., "Lingscape walks". We assume that the more long-lasting and entertaining the app is, the more likely it will be for a participant to repeatedly use the app and engage in the project.
These requirements are derived from the fact that a user's decision to interact with an application ("tool readiness") largely depends on his/her situated experiences with the tasks ("tool experiences") and interaction possibilities within the app ("task situations"), dependent on individual personality traits and attitudes towards the app ("user characteristics"; Sun 2016: 284-285). Regarding the implementation of the app-based approach in a crowdsourcing project, all three development principles define crucial preconditions for the Lingscape project to be successful in the long term (cf. Wang et al. 2016). The main methodological challenges for a photo-based approach to the study of linguistic landscapes (see Section 1) will be discussed in more detail in the following.

Inherent dynamics of the linguistic landscape
In many cases, signs and lettering in public space are relatively persistent, for example official road signs and street names. However, the constant change of urban linguistic landscapes is inevitable, especially when considering all types of transgressive signs,21 which may appear in the landscape at a moment's notice and be wiped off again the day after. Therefore, we can only try to accompany these changes by documenting new signs as soon as they appear. A crowdsourcing approach that directly involves the Luxembourgish citizens in data collection will make it easier to notice and document new signs and lettering in public space, including transgressive ones. Although most users are likely to only take part in the first data collection after the initial media announcements - mainly due to very low retention rates for mobile apps in general22 -, there will still be a large number of people involved in long-term data collection, since Lingscape will also be used as a teaching tool for student projects and in schools. Beyond that, the Lingscape map viewer and database allow for multiple photos within the same location, which can be ordered by their date of upload. This will allow diachronic analyses of lettering practices in specific locations like commercial zones.
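To illustrate how multiple photos within the same location could be ordered by upload date, the following minimal Python sketch groups uploads by rounded coordinates and sorts each group chronologically. It is not the Lingscape implementation: the `Photo` record, its field names, and the rounding precision (four decimals, roughly ten metres) are assumptions made for this example only.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Photo:
    # Hypothetical record layout; the real database schema is not described here.
    photo_id: str
    lat: float
    lon: float
    uploaded: date
    languages: tuple


def location_key(photo, precision=4):
    """Round coordinates so that nearby uploads share one key."""
    return (round(photo.lat, precision), round(photo.lon, precision))


def timelines(photos, precision=4):
    """Group photos by rounded location and order each group by upload date."""
    groups = defaultdict(list)
    for photo in photos:
        groups[location_key(photo, precision)].append(photo)
    return {
        key: sorted(group, key=lambda p: p.uploaded)
        for key, group in groups.items()
    }
```

Each value of the returned dictionary is a chronological series for one spot, which is the kind of input a diachronic comparison of lettering practices would need. Simple rounding can split two neighbouring uploads that fall on either side of a cell boundary, so a real analysis would probably use a distance-based clustering instead.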

Emplacement of the collected data
Another problem all linguistic landscapes research has to address is that the signs we photograph are part of (accidentally or intentionally) arranged semiotic scenes and embedded in everyday social practice of the passersby in a specific location. Beyond that, all signage is created by different authors (institutional, commercial, private, artistic; cf. Edelman & Gorter 2010) expressing specific intentions (e. g., public announcement vs. advertising) and pursuing specific purposes (e. g., information vs. order; cf. Ben-Rafael 2009). So, in order to fully capture the closely interwoven semiotic discourses involved in the creation of a linguistic landscape, we would have to document not only literally all signs in a given location (cf. Soukup 2016), but also to capture as many different contextual aspects of the sign as possible. Furthermore, we would need to reach out to shopkeepers, designers, ministries, street artists, or any other person responsible for the creation of a specific sign or lettering (cf. Malinowski 2009). This is of course impossible. Still, in order to account for the semiotic complexity of social space and the many different actors involved in its creation, a combination of different methodological approaches to the study of linguistic landscapes is not only highly desirable, but necessary (cf. Blackwood 2015). Since the main goals of the Lingscape project relate to a) the crowdsourced collection of signs and b) the involvement of the public in the research process, we will not be able to take account of all the different methods needed for a comprehensive analysis of social semiotic practice in public space. However, different aspects of the emplacement of signs in public space can be dealt with in well-directed student projects analyzing different types of scenes, actors, and discourses (cf. Scollon & Scollon 2003).

Lack of representativity of the collected data
Like every other linguistic landscaping project, Lingscape will not be able to collect a representative sample. The collected data comes from a non-representative population, which introduces a bias to the sample (presumably a younger audience that might photograph different signs than older people). However, the open-access design of the app and the fact that we are creating awareness of the app on national media in Luxembourg give us hope that our data basis will compensate for the lack of representativity by the sheer number of collected photos. By collecting many different individual perspectivations of the landscape without imposing a specific research interest on our participants, we can prevent our corpus from representing only specific (scientific) views on the linguistic landscape. Of course, our users will only upload photos of signs and lettering on the map that are salient ("contextually conspicuous") or pertinent ("practically relevant") to them for some reason (cf. Purschke 2015, 2014). But as stated above, this perceptual "bias" to data collection will be part of data analysis as well.
With multilingualism in general and the weal and woe of Luxembourgish in particular being prominent topics in public discourse, many users are likely to actively participate in the project, at least during the first data collection phase. Furthermore, we will use Lingscape as a digital teaching tool in academic and school curricula, so that we can involve our students (and their peers) in a continuous data collection process including student projects as well as specific linguistic landscaping tasks (e. g., capturing and analyzing all signs and lettering in a defined area). By this means we hope to be able to collect not only many photos but also to some extent reconstruct an accurate image of the multilingual landscape in Luxembourg. Still, data collection is limited by the medium used for collection. Since the app is based on smartphone technology, we can only collect data from smartphone users. This excludes many potential users (and perspectivations of the landscape) from the project. Nevertheless, our potential participant pool is quite big compared to other ways of data collection.

Granularity of the collected data
Since most of the data is collected by our users, the level of granularity to be documented is also decided by the crowd. We can assume, though, that most photos document a holistic view of the respective sign or lettering without paying much attention to para- or meta-lettering. Still, different types of semiotic layering or dialogic signs will become part of the corpus in form of "visible dialogues in public space" (cf. Schmitz/Ziegler 2016). This includes transgressive signs, like stickers or graffiti that are placed directly over other signs, as well as crossing out of certain parts of (official) signs or other forms of transgressive transformation of sign content using comments, corrections, clarifications, or continuations. Schmitz/Ziegler (2016) analyze such dialogues that were captured in the "Metropolenzeichen" project (cf. Ziegler 2013). They propose a typology of visible dialogues in public space, taking into account different types and emplacements of official and transgressive signage, as well as modes and motives of communication. Our data collection will show which types of dialogic signs are present in the Luxembourgish landscape.
Moreover, we can again profit from the use of Lingscape as a teaching tool in assigning specific tasks to students (e. g., to document all street art items in a specific area). As a consequence, our collected data is likely to be more diverse (and also messy) compared to well-defined and pre-structured surveys conducted only by trained researchers. Since one of our main goals is an automated quantitative analysis of the corpus, though, at least some of this should be averaged out during data processing. Also, in opening up the data collection process to citizens, we will be able to document the linguistic landscape in Luxembourg not only from our (trained) perspective, but from the many different perspectives of our users, which may lead to a more diverse yet comprehensive and insightful image of the Luxembourgish landscape.23 And while we are not collecting personal data from our users, we will be able to aggregate all photos uploaded by a specific user by dint of the telephone ID that is stored with every photo. This will in practice enable us to reconstruct not only a holistic image of the Luxembourgish linguistic landscape, but also (anonymous) individual perceptions thereof.
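The aggregation of uploads by telephone ID could look roughly like the following Python sketch, which builds one language-tag profile per anonymous device. The input format (pairs of a device ID and a list of language tags) is an assumption for illustration; how the IDs are actually stored is not specified here.

```python
from collections import Counter, defaultdict


def perception_profiles(uploads):
    """Map each anonymous device ID to a Counter of the language tags it submitted.

    `uploads` is an iterable of (device_id, languages) pairs (hypothetical format).
    """
    profiles = defaultdict(Counter)
    for device_id, languages in uploads:
        profiles[device_id].update(languages)
    return dict(profiles)
```

Comparing such profiles across devices would be one simple way to contrast individual perceptions of the landscape with the aggregate picture.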

Accessibility of the collected data
For our pilot study in Luxembourg, it is unlikely that we will find many signs or lettering in scripts other than the Latin alphabet or in rarely encountered languages, so most lettering should at least be readable and translatable for our team. However, since we are not competent speakers of all languages that we will find in the Luxembourgish linguistic landscape, all photos will be processed by OCR in the second project phase in order to allow automated text extraction, language identification, and content analyses. Of course, this automated text extraction has some technical limitations with regard to blurred photos, hand-written signs, unusual typefaces, graffiti, or texts that have been partly overwritten, crossed out, or stuck over with stickers. Still, available models and tools (e. g., Google Cloud Vision API)24 from visual recognition research give cause to hope that the vast majority of the collected photos can be processed successfully by OCR (cf. Tinoco 2011, Chen et al. 2004, Yang et al. 2001). Apart from that, the users have the possibility to directly add translations of signs during the upload process, which in itself can be used to fine-tune recognition algorithms.
Regarding user access to the collected data: all uploaded photos can be accessed by all users in the app via the built-in map viewer, including the possibility to share and download photos. Furthermore, we have already begun to build a content management system for researchers and teachers who use Lingscape as part of their project. This also includes the possibility to download, map, and analyze whole collections of photos and metadata. In this context, some design decisions for the app need to be reevaluated, particularly the symbolization of visible languages. Currently, the language entries show the original name and script (with an English translation) and a language icon that displays the official ISO 639-3 language code.25 Compared to more iconic visualizations of the tagged languages (like national flags that link a language to specific countries) this typology of languages has the clear advantage of ideological neutrality. Since we cannot expect our users to be aware of all ISO codes for the languages we use in the app, a complete reference of the ISO 639-3 codes is given in the "tips & tricks" section of the app.
Still, for some purposes (i.e., tagging non-official varieties like dialects) this typology has its limitations, although additional varieties can be tagged during upload using the "&" symbol in the comment area. Therefore, a modified language selection tool will be implemented in future versions of the app including the possibility to select regional sets of languages/varieties for different projects. Furthermore, there will be the possibility to add project-specific annotation categories to the upload process, so that researchers can build their very own corpus (with personalized access for data processing) within Lingscape.
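How variety tags in the comment area might be separated from the free-text comment during data processing can be sketched in Python as follows. The exact tagging syntax is an assumption here: the sketch treats "&" followed directly by a single word (e. g., "&moselfränkisch") as a tag and leaves an isolated "&" untouched.

```python
import re

# Assumed syntax: "&" at the start of a token, followed directly by a word.
VARIETY_TAG = re.compile(r"(?<!\S)&(\w[\w-]*)")


def split_comment(comment):
    """Return (variety_tags, remaining_comment_text) for one upload comment."""
    tags = [tag.lower() for tag in VARIETY_TAG.findall(comment)]
    # Remove the tags and normalize the remaining whitespace.
    text = " ".join(VARIETY_TAG.sub(" ", comment).split())
    return tags, text
```

For a comment such as "Shop sign &Moselfränkisch near the station", this yields the tag "moselfränkisch" and the remaining text "Shop sign near the station", while "Fish & chips" is returned unchanged and without tags.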
In general, Lingscape takes an open-access approach to the app and the collected data: No user account or social log-in is required to use the app. Instead, users can anonymously interact with the map viewer and photo upload, once they have accepted the terms and conditions. The only user information collected with every upload is the telephone ID, which is mandatory due to potential legal infringements.26 Again, this methodological decision bears some risks regarding potential misuse in form of offensive content and will inevitably lead to some uploads that do not comply with our scientific and ethical standards (e. g., insulting comments or pornographic photos). These uploads have to be deleted from the server as quickly as possible. Therefore, we constantly check all uploaded photos for inappropriate material. Furthermore, users have the possibility to report misuse directly within the app. We can assume, though, that the majority of uploads will be on-topic and appropriate, so that the open-access solution seems viable for this kind of crowdsourced task. However, since offensive signs and lettering have to be considered part of the linguistic landscape (e. g., xenophobic graffiti or politically controversial stickers), photos of such signs will be stored in the database, but not be made publicly available in the app.27 An alternative way of addressing this problem would have been to use a personalized log-in for app access together with a moderation of the uploads before publication. However, in our view, trusting the crowd by making access to and interaction with the app as easy as possible seems to be the more promising way to go, because it a) does not demand personal information from the users and b) does not deprive the users of an instant confirmation of their interaction with the app. A mandatory log-in and a delayed publication would both likely keep many users from using the app.

Categorization of the collected data for analysis
Since Lingscape pursues a crowdsourcing approach to the study of linguistic landscapes, one of our main concerns is to develop an application that is easily accessible to our users. Therefore, we employ only very few analytical descriptors during the upload process (location, visible languages). In contrast, LinguaSnapp uses a comprehensive set of (non-obligatory) analytical descriptors for data collection (cf. Gaiser/Matras 2016). This approach has the advantage that at least some part of the collected data will already be pre-analyzed when uploaded to the server and therefore offers the possibility to gather well-structured data. However, in our view, the very basic approach taken in the Lingscape project may have two crucial advantages compared to using a differentiating set of descriptors: First, the app is very easy to use for non-experts, which makes it more likely for a large number of citizens to participate in the data collection. For example, we are currently testing an implementation of the app as teaching tool for multilingualism in primary schools. For this type of use, an easy data model is essential.
And second, we do not force the collected photos into a scientific analytical framework, which might be helpful for data analysis, but inevitably shapes what type of data can be collected by the users, and therefore analyzed by us.
However, by asking the users to indicate the visible languages on their photos, we will be able to directly address and analyze one special characteristic of the Luxembourgish linguistic landscape: Due to the large amount of structural interferences and lexical borrowing between Luxembourgish on the one hand and German and French on the other hand, it will be difficult in some cases to categorize a lettering, e. g., if the words are identical in both Luxembourgish and French. Especially for place names, there is often no equivalent word in all three languages - or they are written the same in Luxembourgish, German, and French -, which may lead to a linguistic hybridization on some signs. Asking the users to indicate visible languages on a sign or lettering basically means that we can document and analyze their perceptions of multilingualism in public space, which will allow very interesting insights into the different ways in which citizens conceptualize the semiotic space that surrounds them.28

Figure 3. Process model for user interactions and data processing
Regarding the classification and analysis of the collected data, we will annotate the photos using common sets of analytical descriptors for linguistic landscaping purposes (cf. Backhaus 2007, Ben-Rafael et al. 2006, Scollon/Scollon 2003). Still, our main goals of data processing will be a) the (automated) text extraction and analysis and b) the geostatistical mapping of both the extracted texts and the participants' language annotations. Figure 3 illustrates the whole process model, including data collection and processing, as well as the user interactions with the app (upload, map) and the sharing/reporting features. The concrete procedure and categories for data analysis will be decided on subject to the quality and quantity of the collected data. Specific research questions will concern the presence and regional distribution of Luxembourgish, Portuguese, English, and Italian in public space, code preference and hierarchization in multilingual signs, and people's perceptions of multilingual lettering.

5.7
Legal obligations due to copyrights and property rights
Since public buildings as well as signs and lettering represent works (of art) that are protected by copyrights, trademark rights, and property rights, we have to account for these legal frameworks in the context of Lingscape. Every user therefore has to agree to our terms and conditions, in which these issues are addressed directly. As for the user's copyright regarding his/her uploads, the approval of the terms and conditions includes the permission for the Lingscape team to publish and analyze the uploaded photos, assuming that the user is in possession of all the necessary rights to do so. Regarding the trademark and property rights (and also the copyrights of the sign authors/designers), we can refer to the "freedom of panorama" principle for persistent works in public space as it is stated in national and international Copyright Acts.29 Since we are not pursuing commercial use of the photos, taking photos of signs and lettering in public space should not pose a substantial problem to data collection. Still, there are many countries in which the freedom of panorama is limited or does not even exist. Therefore, the terms and conditions have to make sure that a) our data collection only pursues scientific goals, b) the users grant us the rights to publish online and analyze their contributions, and c) the users are aware of the specific legal frameworks that may apply for taking photos in public space in their countries. In order to make sure that the terms and conditions match legal standards, we have developed a comprehensive terms & conditions section together with law experts from the University of Luxembourg.

Data presentation and analysis
At this point, an accessible way of data presentation is already part of the research design via the in-app map viewer. However, we are developing more advanced ways of data visualization like the web-based map analysis tool and a comprehensive research database for data annotation and analysis in the second project phase. One main goal to pursue in this regard is the implementation of computational analysis methods for large amounts of data using techniques from research on visual recognition and OCR in order to extract and process the on-sign texts. Still, many photos will have to be analyzed manually due to bad image quality, non-readable typography, and text corrections or other content modifications. Also, since we have to rely on our users regarding the geolocalizing of the photos, many defined locations will not be as exact as desirable. Therefore, an automated linguistic analysis and geostatistical mapping of all collected photos will be part of the second project phase. This will include cumulative heat maps illustrating local and regional language dominance as well as linguistic and typographic content analyses regarding the hierarchization and copresence of language patterns on signs. For example, we expect the north of Luxembourg to contain a higher percentage of Luxembourgish and German signs, while in the south French and Portuguese should be more prominent in the landscape due to the historically grown region-specific settlement structures. Beyond that, inevitable double entries for specific signs can easily be detected and eliminated from the analysis by dint of software processing.
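The aggregation behind such a heat map of language dominance could, under simple assumptions, be sketched in Python as follows. The sketch bins photos into a regular latitude/longitude grid and computes, per cell, the share of photos on which each language was tagged. The grid size, the record format, and the function names are hypothetical and serve illustration only; the actual geostatistical method has not been fixed.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, cell_deg=0.1):
    """Return integer grid indices for a coordinate; cell_deg is the cell size in degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def language_shares(records, cell_deg=0.1):
    """Compute, per grid cell, the share of photos on which each language is tagged.

    `records` is an iterable of (lat, lon, languages) triples (hypothetical format).
    """
    counts = defaultdict(Counter)
    totals = Counter()
    for lat, lon, languages in records:
        cell = grid_cell(lat, lon, cell_deg)
        totals[cell] += 1
        # Count each language at most once per photo.
        counts[cell].update(set(languages))
    return {
        cell: {lang: n / totals[cell] for lang, n in langs.items()}
        for cell, langs in counts.items()
    }
```

Because a photo can carry several language tags, the shares within one cell need not sum to one. Comparing cells in the north and south of the country would be one way to check the expected regional differences once enough data is available.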

Conclusion and outlook
This paper introduces the research and teaching tool Lingscape, describes its basic functionalities, and outlines the potential for research as well as methodological challenges of a crowdsourcing approach to the study of linguistic landscapes. Linguistic research with the help of smartphone applications has a great potential for attracting citizens and promoting their engagement in research activities, especially in a multilingual society like Luxembourg. In doing so, our pilot study in Luxembourg will not only provide new evidence for the benefits of crowdsourcing methods in linguistics, but will also help to foster the citizens' social awareness for the specific challenges and opportunities of multilingual communities. Some of the technical and methodological decisions taken during the development process (no personalized log-in, ex-post moderation of uploads, few analytical descriptors) bear a certain risk of misuse by the users, or at least could lead to messy data. Our pilot study in Luxembourg will therefore also reveal whether or not this methodology represents a feasible approach to crowdsourcing as a) a resource for research on linguistic landscapes and b) a survey method in the context of a citizen-science project (see Purschke forthcoming).