Mobile data project
Our mobile data study examines the movements of foreign travellers in Lithuania using big data. This data reveals aspects of foreigners' journeys that have not been explored in such detail before, such as the routes with the highest flow of foreigners, the most visited destinations, their popularity among travellers from different countries, the duration of the visits, and more.
The project's results are intended for all stakeholders in the tourism ecosystem. The data can be used to improve the experience of foreign tourists in Lithuania and assess the need for investment in tourism services and products, as well as in enhancing the existing tourism infrastructure. The tourism statistics presented in the detailed maps may be particularly relevant in regions of Lithuania where data supply has been insufficient.
Three ways to use the dashboards and maps:
- General data in the municipalities’ dashboard – for analysis of general trends and indicators
- Daily data in the municipalities’ dashboard – to assess the results of holidays and events
- Visitor arrivals on the detailed map – to analyse visitor flows to tourist sites, to plan the development of tourist sites and services as well as infrastructure upgrades
Data: calculated from depersonalised metadata of mobile users.
Time period: May-October 2021, April 2022-August 2023.
Implementation: The pilot project was carried out by Lithuania Travel in cooperation with Positium, an Estonian company that has been developing mobile big data analytics methodologies for almost 20 years, and the Lithuanian companies Tele2 and PositIn.
Frequently Asked Questions (FAQs)
- What statistical data is used to monitor the travel patterns of foreign visitors?
Here is the data monitored:
- Number of unique visitors – an indicator used to measure the number of foreign visitors who have visited the selected area during the selected period. A foreign visitor who has visited the selected area more than once during the selected period is considered as 1 unique visitor.
- Number of trips – an indicator for measuring the number of trips made by all unique visitors who have been present in the selected territory during the selected period. One trip is defined as all places visited by a foreign visitor from the moment of arrival in the selected territory (first connection to the Lithuanian mobile network) to the moment of departure from the selected territory (last connection to the Lithuanian mobile network). If there is a period of more than 48 hours between the mobile metadata records of one unique visitor (the visitor has been absent from Lithuania for more than 2 days), then it is considered that 1 foreign visitor has made 2 separate trips.
- Number of stays – an indicator measuring the number of times all unique visitors stopped, at least briefly, in the selected area during the selected period. A single stay is counted for each place where a foreign visitor has been recorded at least once, regardless of the duration of the visit. This indicator helps to identify the maximum tourism potential, as it covers both transit and regular visitors. If a foreign visitor has been recorded 3 times in the selected area during 2 separate trips in the selected period, 3 stays are counted. Together with the number of unique visitors, this indicator can be used to calculate the average frequency of stays by foreign visitors in the selected area (number of stays divided by the number of unique visitors).
- Number of one-day visitors – an indicator measuring the number of unique visitors to the selected territory who did not use accommodation services (did not spend the night in the selected territory). A foreign visitor who has visited the selected territory more than once during the selected period, always outside the overnight hours, is counted as only 1 unique one-day visitor, irrespective of the number of trips or stays made.
- Number of first stays – an indicator measuring the number of times foreign visitors started their trips in the selected territory. The location of the first stay is determined by where the foreign visitor first connected to the Lithuanian mobile network. There can be only 1 first stay per trip. This indicator helps to identify where foreign visitors usually start their journeys in Lithuania. Please note: foreign visitors do not always connect to the Lithuanian mobile network as soon as they cross the state border, so this indicator is not a good measure of the number of cross-border travellers.
- Number of last stays – an indicator measuring the number of times foreign visitors completed their trips in the selected territory. The location of the last stay is determined by where the foreign visitor last connected to the Lithuanian mobile network. There can be only 1 last stay per trip. This indicator helps to identify where foreign visitors' trips to Lithuania usually end. Please note: foreign visitors do not always disconnect from the Lithuanian mobile network at the state border, so this indicator is not a good measure of the number of cross-border travellers.
- Number of distinct calendar days spent by foreign visitors – an indicator designed to measure the amount of time (in calendar days) spent by all foreign visitors in the selected territory during the selected period. This indicator, together with the number of trips, helps to determine the average length of time spent by foreign visitors in the selected territory (number of calendar days spent divided by the number of trips).
- Nights spent – an indicator measuring the number of nights spent by all unique visitors in the selected territory during the selected period. A night is counted when a foreign visitor was in the selected territory at 4 am.
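The indicator definitions above can be illustrated with a short Python sketch. It is not the project's implementation. It assumes a hypothetical input of per-visitor lists of (timestamp, area) records, and the function names and sample data are invented for illustration. It also assumes that a visitor's location at 4 am is the area of their most recent record at or before that time. The number of stays is left out because the text does not define how repeated records are grouped into stays.

```python
from datetime import datetime, timedelta

TRIP_GAP = timedelta(hours=48)  # a longer silence starts a new trip (from the text)


def split_into_trips(records):
    """Split one visitor's (timestamp, area) records into trips."""
    trips = []
    for ts, area in sorted(records):
        if trips and ts - trips[-1][-1][0] <= TRIP_GAP:
            trips[-1].append((ts, area))
        else:
            trips.append([(ts, area)])
    return trips


def nights_in_area(trip, area):
    """Count the 4 am instants of a trip at which the visitor was in `area`.

    Assumption: the location at 4 am is the most recent record at or before 4 am.
    """
    start, end = trip[0][0], trip[-1][0]
    check = start.replace(hour=4, minute=0, second=0, microsecond=0)
    if check < start:
        check += timedelta(days=1)
    nights = 0
    while check <= end:
        last_area = [a for ts, a in trip if ts <= check][-1]
        nights += last_area == area
        check += timedelta(days=1)
    return nights


def summarise(records_by_visitor, area):
    """Compute the indicators for one selected area over all supplied records."""
    totals = dict(unique_visitors=0, one_day_visitors=0, trips=0,
                  first_stays=0, last_stays=0, calendar_days=0, nights=0)
    for records in records_by_visitor.values():
        visited, visitor_nights = False, 0
        for trip in split_into_trips(records):
            days_in_area = {ts.date() for ts, a in trip if a == area}
            if not days_in_area:
                continue  # this trip never touched the selected area
            visited = True
            totals["trips"] += 1
            totals["first_stays"] += trip[0][1] == area
            totals["last_stays"] += trip[-1][1] == area
            totals["calendar_days"] += len(days_in_area)
            visitor_nights += nights_in_area(trip, area)
        totals["nights"] += visitor_nights
        totals["unique_visitors"] += visited
        totals["one_day_visitors"] += visited and visitor_nights == 0
    return totals


def at(day, hour):
    """Helper for the invented sample data: a timestamp in July 2023."""
    return datetime(2023, 7, day, hour)


records_by_visitor = {
    "visitor_a": [(at(1, 10), "Vilnius"), (at(1, 22), "Vilnius"),
                  (at(2, 9), "Kaunas"), (at(2, 23), "Kaunas"), (at(3, 12), "Kaunas"),
                  (at(10, 8), "Vilnius"), (at(10, 18), "Vilnius")],
    "visitor_b": [(at(5, 9), "Kaunas"), (at(5, 15), "Vilnius")],
}
stats = summarise(records_by_visitor, "Vilnius")
print(stats)
print(stats["calendar_days"] / stats["trips"])  # average calendar days per trip
```

In this invented sample, visitor_a is absent for more than 48 hours between 3 and 10 July, so their records split into 2 trips. For Vilnius the sketch therefore reports 2 unique visitors, 3 trips, 1 night and 1 one-day visitor.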
- What data was used for the statistics on foreign visitors?
The foreign visitor statistics were determined using depersonalised inbound mobile positioning data (MPD) from foreign roaming mobile users of the mobile operator Tele2. This metadata is generated whenever a mobile user's device connects to the mobile network, such as when making or answering a call, sending or receiving a text message, or using mobile data. The metadata includes information on the mobile tower to which the user has connected, the approximate direction from the mobile tower to the user's location, the date and time of connection, and other information relevant for mobile service accounting. Importantly, this metadata does not contain any information about the content sent or received by mobile users. Mobile users' metadata is stored in secure databases of the mobile operator for a maximum of six months.
- How is the data used to compile statistics on foreign visitors collected, stored and processed?
To obtain statistics on foreign visitors, mobile user data is first depersonalised by encrypting the mobile user's telephone number and other personally identifiable information. Only the mobile operator knows how the data has been encrypted, so it is not possible to reconstruct the data and identify the individual. In addition, all the metadata used for statistics is aggregated for each area analysed. If there were fewer than ten unique foreign visitors in the selected territory during the selected period, that data is not used. This eliminates the possibility of identifying the travel itinerary of a particular person. These data cleansing procedures are carried out by automated algorithms (without the direct involvement of specialists) on a secure server controlled by the mobile operator, to which only authorised persons have access. In this way, all mobile data analytics are carried out according to the highest standards of data processing for mobile operators. Only the final foreign tourist statistics are downloaded from the mobile operator's servers.
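The depersonalisation and aggregation steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The text says identifiers are encrypted by the operator without naming the scheme, so a keyed hash (HMAC-SHA256) is swapped in here as a stand-in. The key, the function names and the sample records are hypothetical. Only the threshold of ten unique visitors comes from the text.

```python
import hashlib
import hmac
from collections import defaultdict

MIN_UNIQUE_VISITORS = 10  # threshold stated in the text
SECRET_KEY = b"operator-held-secret"  # hypothetical key, known only to the operator


def pseudonymise(phone_number):
    """Replace a phone number with a keyed hash (a stand-in for the operator's encryption)."""
    return hmac.new(SECRET_KEY, phone_number.encode(), hashlib.sha256).hexdigest()


def aggregate(records):
    """Count unique pseudonymised visitors per area and drop small groups.

    `records` is an iterable of (phone_number, area) pairs.
    """
    visitors = defaultdict(set)
    for phone_number, area in records:
        visitors[area].add(pseudonymise(phone_number))
    return {area: len(ids) for area, ids in visitors.items()
            if len(ids) >= MIN_UNIQUE_VISITORS}


# Invented sample: 12 distinct visitors in one area, 3 in another.
records = [(f"+3725000{i:04d}", "Vilnius") for i in range(12)]
records += [(f"+3715000{i:04d}", "Neringa") for i in range(3)]
records.append(("+37250000000", "Vilnius"))  # repeat record for an existing visitor
print(aggregate(records))  # {'Vilnius': 12}
```

The area with only 3 unique visitors is dropped from the output entirely, and the repeated record does not inflate the Vilnius count because visitors are counted by pseudonym.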
In addition to mobile user data, other open data sources are also used for foreign visitor statistics, such as data on land use and use of buildings (residential, commercial, industrial), road and street network, coordinates of seaports, airports, border checkpoints, transit corridor zones, passenger statistics from seaports and airports, and official tourism statistics.
- For what purposes can statistics on foreign visitors be used?
The Lithuania Travel foreign visitor statistics, determined using aggregated depersonalised metadata from mobile users, aim to enhance the experience of foreign travellers during their stay in Lithuania. These statistics are intended to develop new tourism services and products, particularly in Lithuania’s regions, and to monitor and optimise the impact of promotional measures. Detailed information on which places visitors frequent most in Lithuania will help tailor tourism infrastructure to actual needs, such as providing tourist information in relevant foreign languages and clearer signage. Additionally, monitoring changes in foreign visitor flows to specific destinations will assist in offering additional services or products, like setting up new sales points or expanding the accommodation network.
Statistics on foreign visitors will aid in assessing the return on investment (ROI) and identifying specific needs for investment in the tourism sector. The project’s results will also allow for monitoring which tourism promotion measures, such as marketing campaigns and investments in public infrastructure improvements, have the greatest impact, enabling optimisation for maximum effect.
As the foreign visitor statistics are an open dataset available to all Lithuanian citizens, there is potential for wider use of this data for education, raising awareness of the Lithuanian tourism sector, and other purposes.
- On what basis was data from mobile phone users used to compile statistics on foreign visitors?
The foreign visitor statistics are compiled in accordance with applicable legislation and the highest standards of mobile data analytics and ethics.
This pilot project is considered a research study that adapted the international methodology for compiling tourism statistics based on mobile data to Lithuania. The results of this study will be presented at various scientific conferences.
In accordance with Regulation (EU) 2016/679 of the European Parliament and of the Council on the protection of natural persons regarding the processing of personal data and on the free movement of such data (General Data Protection Regulation, GDPR), further processing for archiving purposes in the public interest, for scientific or historical research purposes, or for statistical purposes is considered compatible and lawful processing. The pilot project for compiling statistical indicators on foreign visitors adhered to GDPR requirements and other relevant legislation of the European Union and the Republic of Lithuania. Recommendations and guidelines from the State Data Protection Inspectorate of the Republic of Lithuania, including the "Methods of Depersonalisation" (2015), were applied to the analysis of mobile user data, employing pseudonymisation, aggregation, and k-anonymity methods.
Finally, the project partners followed publicly available ethical standards when compiling statistics on international visitors, including:
- Positium – Guidelines on Data Safety and Ethics.
- Tele2 – Code of Conduct.
- Why was metadata from mobile users used to compile statistics on foreign tourists?
Mobile user metadata is a reliable alternative source of tourism data, complementing other traditional official statistics such as immigration data, tourist surveys, and accommodation statistics. This metadata, collected and used regularly for mobile phone accounting and other legitimate purposes, is repurposed for statistical research without causing any additional inconvenience to tourists. Additionally, it reduces the cost of traditional statistical compilation methods.
The experience of the Bank of Estonia, which has maintained tourism statistics using mobile data since 2009, shows that compiling statistics with existing mobile data sources is 2.5 times more cost-effective than traditional methods. With over 5.2 billion people (about 62% of the global population) using mobile services in 2020, mobile data offers a sample size at least 200 times larger than traditional statistical methods. This enables the compilation of foreign visitor statistics at near real-time intervals (daily, weekly, monthly) and at a very high level of detail, based on the coverage areas of mobile towers. Such statistics can also be gathered for remote areas where traditional methods are impractical. Moreover, mobile data can identify travel patterns of foreign visitors within a selected area, from their point of entry to their point of departure, a capability not provided by traditional statistical methods.
- Who has access rights to the mobile data used to compile statistics on foreign visitors, and what are those access rights?
The metadata of mobile users used to compile statistics on foreign visitors is collected and stored on the infrastructure of the mobile operator Tele2. This data is depersonalised by encrypting personally identifiable information. Positium, an Estonian company specialising in generating statistics from mobile data, further analysed this depersonalised metadata. Positium has been developing these methods in collaboration with the University of Tartu for almost 20 years.
Positium's software, Positium Data Mediator, is deployed within Tele2's technical infrastructure to clean and aggregate mobile data, and produce intermediate statistics. Using these intermediate statistics, Positium applied a mathematical model, incorporating additional data (such as seaport and airport passenger statistics, accommodation statistics, geographical data on land use, and road networks) to produce weighting factors for each territory analysed. This process ultimately compiled comprehensive statistics on all foreign visitors to Lithuania.
Throughout this entire process, only authorised specialists from Tele2 and Positium, responsible for necessary actions such as software installation, calibration, and algorithm execution, had secure access rights to the mobile user data.
The statistical data on foreign visitors (including maps) is stored on different hardware than the original mobile user data. The main dashboard, displaying foreign visitor rates by municipality and county, is hosted on a server administered by Tele2. Detailed maps are hosted on servers administered by Esri within the European Union using ArcGIS Online software.
Lithuania Travel and PositIn do not have any access rights to the mobile users' metadata and only use the final statistical data of foreign visitors.
- Can I trust statistics on foreign visitors?
The Positium Data Mediator (PDM) software, developed by the Estonian company Positium, is used to compile foreign visitor statistics based on the metadata of mobile users. Aiming to transform mobile big data into understandable and reliable data, Positium has been developing this software and its underlying methodology in collaboration with the University of Tartu since 2004. The data model and statistical indicators used in this methodology are based on years of global experience and academic excellence.
The PDM methodology aims to transform large mobile datasets into a reliable data model that accurately reflects people's behaviour in space over time, with accuracy limited only by the quality of the mobile data. Positium's experience with national statistical offices, public authorities, ministries of tourism and planning, municipalities, spatial planners, data businesses, and international organisations (including the United Nations World Tourism Organisation and the United Nations Global Working Group on Big Data for Official Statistics) ensures that the PDM methodology aligns with the highest standards of global statistical production.
Calculating potential margins of error in foreign visitor statistics is complex, because the sources of error are heterogeneous. Producing statistics from mobile phone metadata differs from traditional statistical methods (e.g., surveys), which have clearly defined respondent samples and a known relationship to the total population (in this case, all foreign tourists visiting Lithuania). This pilot project has not yet identified the most appropriate reference data for pinpointing potential biases in foreign visitor statistics (e.g., official accommodation statistics do not include data on foreign visitors staying in unregistered accommodation). Project partners continue to work with Lithuania Travel to verify the accuracy of foreign visitor statistics, for example, by comparing the total number of participants in specific events in Lithuanian regions with the statistical data.
To reduce potential errors in foreign visitor statistics, the initial metadata of mobile users was first cleaned by:
- Removing data from mobile connections used in hardware (e.g., GPS tracking of vehicles)
- Removing duplicate mobile user data (e.g., when a user carries more than one mobile device)
- Removing random mobile user data (e.g., data from passing vessels)
- Removing data of users who accidentally connect to the Lithuanian mobile network at the border
- Removing data of users in transit through Lithuania
- Identifying mobile users who live and work in border regions
- What additional data was used to compile the statistics on foreign visitors?
Many additional geographical and statistical datasets were used to compile the statistics on international visitors. Specifically, the following data was used: geographic coordinates of mobile phone towers, geographic data on land use and building use (residential, commercial, industrial), data on the road and street network, geographic coordinates of seaports, airports, and border crossings, and data on transit corridors. These datasets were used to create the most accurate geographic grid of Lithuania, from which mobile phone users' data was aggregated.
Other official statistics were employed to create weighting factors for each territory analysed, including passenger statistics for seaports and airports, migration data, and publicly available attendance data for individual events.
- Is my data safe?
Yes, your data is safe. All measures have been taken to ensure that no individual can be identified when compiling statistics on foreign visitors. It is not possible to make decisions about a specific person based on the foreign visitor statistics, and therefore this data cannot be used for direct marketing or other direct impact purposes. The compilation of foreign visitor statistics has been carried out in compliance with higher requirements for the protection of personal data than those laid down in the General Data Protection Regulation (GDPR) of the European Union and the legislation of the Republic of Lithuania. Additionally, the partners of this pilot project (Positium, Tele2, and PositIn) are committed to the highest data protection and ethical standards.
- Who else uses mobile user metadata?
Mobile user metadata is collected, stored, and used by each mobile operator to keep track of mobile services, including the number and duration of calls, the number of text messages sent and received, and the amount of data transferred. Each time a mobile user uses mobile services or travels and connects to different mobile towers, a record of the mobile user's action is automatically created in a dedicated and well-secured database of the mobile operator. Each record contains an accurate record of the date and time of its creation and the mobile tower to which the user was connected. Besides recording mobile usage, these records are used for the analysis and management of mobile infrastructure to ensure the smooth operation of the mobile network. By knowing which mobile tower users are connected to, the network can interconnect them effectively. Mobile users' metadata is also used for other legitimate purposes, including ensuring public safety in certain cases.
In other countries, mobile user metadata is used for various statistical indicators, such as determining population size and calculating population movements between different cities or areas. These statistical indicators are increasingly used in transport planning (e.g., public intercity transport routing), spatial planning (e.g., master plans of cities), civil protection (e.g., planning and executing evacuation operations), and commercial research (e.g., identifying the optimal location for new retail and service facilities or real estate projects).
- Is it possible to identify personal data from foreign visitor statistics?
No, it is not possible to identify personal data from foreign visitor statistics. All metadata from mobile users used to compile these statistics was depersonalised before analysis by encrypting information (e.g., phone numbers) that could identify a specific person. Additionally, all depersonalised data was aggregated (summed up for each territory) and extrapolated using additional data sources. If the aggregated data consisted of fewer than 10 unique mobile users in a given area, such data was not used. These methods ensure that foreign visitor statistics do not contain information that could identify the location of a particular person within the selected territory or reveal which other territories were visited by the same individual. Established depersonalisation methods (pseudonymisation, aggregation, and k-anonymity) have been applied to foreign visitor statistics to ensure that, even when combined with other data, attempts to identify a particular individual are impossible or exceedingly difficult.
- Why was data from only one mobile operator used to compile statistics on foreign visitors?
At the start of this pilot project, a survey of mobile operators was conducted to identify the mobile data indicators most relevant for compiling international visitor statistics, including market size, the number of roaming contracts with mobile operators in other countries, and the quality of metadata on mobile users.
The study analysed data from the mobile operator Tele2. Tele2 was found to have a sufficient market size and number of contracts with mobile operators in other countries to compile the statistics. Additionally, the quality of the data collected and stored by Tele2 met the data quality requirements for statistical production.
The methodology used to compile the statistics on international visitors, developed by the Estonian company Positium in cooperation with the University of Tartu, has the potential to combine data from multiple mobile operators. However, such a model requires much greater administrative, technological, and methodological effort. Even if data from all mobile operators were used, some foreign visitors would still not be included in the statistics because they do not use roaming mobile services (e.g., they switch off their phone when travelling or purchase local mobile services). The mathematical extrapolation model developed by Positium was used to estimate the statistics for foreign visitors who did not use roaming mobile services. Therefore, data from one mobile operator was sufficient to generate the statistics for foreign visitors.
- How long is the data of mobile users stored for the calculation of foreign visitor statistics?
The mobile operator Tele2 collects and stores mobile user metadata, in particular that used for the accounting of mobile services, for a maximum period of 6 months on secure hardware under its management. For the purposes of this pilot project, mobile users' data was depersonalised by encrypting personally identifiable information and aggregating the data of all mobile users in each of the areas analysed. The statistics thus produced cannot be used to identify a specific individual and therefore have no time limit for their storage.
- Is it possible to identify the route taken by a particular foreign visitor?
No, it is not possible to identify the route taken by a particular foreign visitor. The statistics on foreign visitors only reveal general patterns in the movements of many visitors, such as identifying areas where the highest number of foreign visitors are first detected (points of arrival). These statistics were compiled using depersonalised and aggregated data on mobile phone users. Additionally, data from areas where foreign visitor statistics consist of fewer than 10 unique visitors are not included in the calculations. For these reasons, it is not possible to identify the travel itinerary of an individual based on foreign visitor statistics.
- Are there alternative ways to calculate foreign visitor statistics without using data from mobile operators? Are these alternative methods used?
There are several traditional methods for compiling statistics on foreign visitors, such as immigration data, accommodation data, maritime and airport passenger statistics, and visitor surveys. However, using metadata from mobile phone users to compile foreign visitor statistics is four times faster than the survey method and covers a sample of foreign visitors more than 200 times larger. Additionally, mobile data can identify the number of foreign visitors from a list of countries more than 12 times larger than that covered by the survey method. Based on the experience of the Bank of Estonia and other countries, compiling statistics on foreign visitors using mobile data is 2.5 times more cost-effective than traditional methods. Furthermore, mobile user metadata allows for the production of statistics that would be impossible or difficult to collect using traditional methods, such as the number of day visitors and the number of overnight stays in non-registered accommodation.