The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook

Published online by Cambridge University Press:  24 June 2021

Elise Coudin*
Affiliation:
Institut national de la statistique et des études économiques, Montrouge, France
Mathilde Poulhes
Affiliation:
Institut national de la statistique et des études économiques, Montrouge, France
Milena Suarez Castillo
Affiliation:
Institut national de la statistique et des études économiques, Montrouge, France
*Corresponding author. E-mail: elise.coudin@insee.fr

Abstract

During the COVID-19 crisis, the French National Institute of Statistics and Economic Studies (INSEE) used aggregated and anonymous counting indicators based on network signaling data from three of the four mobile network operators (MNOs) in France to measure the distribution of the population over the territory during and after the lockdown and to enrich the toolbox of high-frequency economic indicators used to follow the economic situation. INSEE’s strategy was to combine information coming from different MNOs with the national population estimates it usually produces in order to obtain more reliable statistics and to measure uncertainty. This paper describes this initiative and situates it within the long-term methodological collaborations between INSEE and different MNOs, and between INSEE, Eurostat, and some other European national statistical institutes (NSIs). These collaborations aim at constructing experimental official statistics on the population present in a given place and at a given time, from mobile phone data (MPD). The COVID-19 initiative has confirmed that more methodological investments are needed to increase the relevance of and trust in these data. We suggest that this methodological work should be done in close collaboration between NSIs, MNOs, and research, to construct the most reliable statistical processes. This work requires exploiting raw data, so the research and statistical exemptions present in the general data protection regulation (GDPR) should also be introduced in the new e-privacy regulation. We also raise the challenges of articulating commercial and public interest rationales and of articulating transparency and commercial secrecy requirements. Finally, we elaborate on the role NSIs can play in the MPD valorization ecosystem.

Type
Translational Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Policy Significance Statement

The COVID-19 crisis confirmed that mobile phone data can provide information on the population present in a given place at a given time, and can support mobility analyses, both of which are useful for public decision-making, and not only in times of epidemics. However, indicators provided by the data analytics solutions of the different mobile network operators (MNOs) showed some shortcomings with regard to official statistics standards and needs. This calls for stronger collaborations between official statistics and MNOs, to construct a fruitful partnership in which commercial and public uses are articulated in a secure legal environment using privacy-enhancing technologies.

1. Introduction

Data passively generated by mobile networks have emerged as a valuable data source for studies of human presence, mobility, and social interactions (Blondel et al., 2015). Given that they provide precise and up-to-date information, these data are of great interest for public decision-making. The French National Institute of Statistics and Economic Studies (INSEE), together with Eurostat and some other European national statistical institutes (NSIs), identified some years ago their potential for official statistics, as a complement to more standard statistical data sources. These NSIs have developed frameworks to integrate mobile phone data (MPD) into statistical production, but practical experiments remain rare (DGINS, 2013; Debusschere et al., 2016; Ricciato et al., 2017; Ricciato et al., 2020; ESSnet Big Data WP5, n.d.). In parallel, INSEE, as a data producer, has initiated methodological collaborations with the R&D departments of different French mobile network operators (MNOs) in order to confront MPD with official statistics standards as well as to develop innovative experimental statistics (Vanhoof et al., 2017; Sakarovitch et al., 2018; Vanhoof et al., 2018).

At the beginning of the COVID-19 lockdown, with a view to measuring the large movements of people that occurred just before the confinement came into force, INSEE quickly initiated limited-in-time and specific collaborations with three of the four MNOs operating in France. INSEE’s strategy was to combine aggregated and anonymous indicators on population movements and mobility provided by those MNOs, as well as to keep control over the final statistical treatments in order to rapidly disseminate these experimental statistics as reliably as possible. This paper relates this initiative.

This MNO project was part of a wider strategy to provide statistics on crucial issues such as the economic downturn and excess mortality (see Chief Statistician J. L. Tavernier’s post on the INSEE blog; Tavernier, 2020). This experience has confirmed that MNO data are of strong interest for public decision-making and for official statistics more generally. However, more collaborations between statistical institutes and MNOs are needed to increase the relevance of and trust in these data, in particular through methodological improvements. Our experience has shown that MNOs do have an important role to play alongside NSIs in the production of valuable information that serves the public interest.

2. Setting the Scene

France, like many countries affected by SARS-CoV-2, took strict restrictive measures in March 2020 to contain the circulation of the virus. Within 4 days (March 14–17), schools and nonessential businesses were successively closed and the population was placed under lockdown nationwide for almost 2 months (March 17–May 11); see Salje et al., 2020; Legifrance, 2020. Just before the lockdown, major population movements took place, leading to an unknown redistribution of the population over the territory, which had to be documented in the context of the management of the health crisis. This prompted INSEE to request access to contemporary MPD to provide nationwide information on population distribution that could be used in addition to more traditional residence-based population statistics (census and administrative data).

During the very first weeks of lockdown, INSEE offered the four main mobile phone operators in the country a collaboration specific to the health crisis. The topics of collaboration were clearly specified from the beginning: documenting the distribution of the population across the territory, which was expected to differ from the distribution of people in their usual places of residence, and its evolution once the lockdown was lifted; and providing mobility indicators, especially commuting indicators, to enrich the toolbox of high-frequency indicators used in the INSEE economic outlook.

As the NSI, INSEE must ensure that the statistical products it disseminates comply with the principles of official statistics, that is, professional independence, objectivity, impartiality, relevance, quality, and sound methodology as established by national and European regulations (National Statistical Law, n.d.; Regulation (EC) No 223/2009). This is especially challenging when the Institute is not the direct collector of data and when the data are not primarily collected for statistical purposes. Given the emergency and the massive nature of the socioeconomic shocks that had to be measured, it was not feasible to address in time all the methodological aspects needed to ensure that official statistics standards were respected. INSEE used anonymous indicators produced by the data analytics services of the MNOs based on network signaling data in accordance with the Directive 2002//EC (the ePrivacy Directive; EUR-Lex, n.d.). Having no control over the methodology used to construct the indicators, INSEE’s strategy relied on combining data products from various MNOs that were already available or easily achievable, in order to mitigate the risks of bias and other consistency issues (Batista e Silva et al., 2020). INSEE also chose to keep the final calibration of the results in its own hands.

Three MNOs responded favorably to the proposal and engaged in time-limited philanthropic collaborations for the sole context of the health crisis. Confidentiality agreements were established to frame the delivery and use of aggregates in compliance with the GDPR. Each collaboration had its own specificities concerning the geographical and time ranges, the purposes, and the scope of the philanthropic collaboration. At the same time, some MNOs were solicited by various public bodies and also felt the need for a (partly) centralized response in which INSEE could participate. This lasted until the end of May. Since then, INSEE has not entered into commercial partnerships.

INSEE already had ongoing methodological collaborations with two MNOs for the construction of experimental statistics on the population present in a given place at a given time. The collaborations with these MNOs were faster to launch. INSEE’s participation in the European Task Force on the use of MPD for official statistics and its role as a regular methodological partner of telecommunication operators accelerated the design of the necessary data and their processing.

3. Providing Input Data to a Data/Statistical Producer

The three operators that collaborated with INSEE had data analytics units exploiting network signaling data. They provided statistical products taken from their commercial offers, namely aggregated and anonymous population-adjusted counts of people by department of presence (NUTS3) during the night and by department of residence, and origin–destination trip count matrices. Only one operator agreed to produce a specific product for INSEE. Given INSEE’s objective of combining information, the use of already existing statistical products raised a first problem of compatibility between concepts, measures, and methods, which depended on each MNO’s choices.

INSEE’s core competency is to process and produce data and to make it understandable. INSEE therefore needed the least preprocessed data possible in order to adjust the results using its own data sources and to perform specific statistical treatments (the lockdown situation made it reasonable to assume no entry into or exit from the French territory, which justified recalibrating the overall population counts to national population estimates). With these post-treatments, INSEE went further than a direct use of the products provided by MNOs. However, many indicators received by INSEE had already been statistically adjusted to the whole population following the MNOs’ methodologies, which hindered INSEE from getting the most out of the combination of sources. NSIs have the legitimacy and a unique capacity to pull together various data sources, inter alia to implement relevant statistical adjustments in order to identify and correct sample bias. MNOs do not share this capacity, and their datasets are limited by definition to their customer databases, which are not representative of the whole population. Since there is a public interest in improving the quality of the data used to produce new public or private services, it would be worth studying further under which terms and conditions NSIs could produce and provide the ad hoc anonymous information that operators need to correct the sample biases of their indicators more accurately.
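The recalibration step mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible adjustment, a proportional rescaling of departmental counts so that they sum to a national population estimate; the function name, department codes, and figures are hypothetical, and the actual INSEE adjustment may well be more elaborate.

```python
def calibrate_to_national_total(counts, national_total):
    """Scale departmental counts proportionally so that they sum to national_total.

    counts: dict mapping a department code to a raw presence count.
    national_total: national population estimate used as the calibration target.
    """
    raw_total = sum(counts.values())
    if raw_total <= 0:
        raise ValueError("counts must sum to a positive number")
    factor = national_total / raw_total
    return {dept: value * factor for dept, value in counts.items()}


# Hypothetical night-time presence counts by department (illustrative numbers only).
raw_counts = {"75": 1_800_000, "35": 1_100_000, "29": 950_000}
calibrated = calibrate_to_national_total(raw_counts, national_total=4_000_000)
```

A proportional rescaling preserves the relative distribution across departments while forcing the total to match the reference estimate, which is only defensible when the total population on the territory can be treated as fixed, as in the closed-border lockdown context.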

The transmission of nonadjusted data to NSIs: issues for MNOs (see Footnote 1)

Nonadjusted data can reveal the respective market shares of the competing MNOs at various geographical scales. Although they are required to report their national market share to the regulator, MNOs are reluctant to share their local market knowledge with NSIs, expressing fears that it could leak to competitors. A solid guarantee of confidentiality drawn up between NSIs and operators is therefore crucial.

The MNOs delivered data at different frequencies during the crisis, from a single delivery for one MNO to more than 10 deliveries for another. Some operators agreed to discuss the methodology used to compute the indicators on a very regular basis, making their methodologists available to answer our questions and sometimes even adjusting the calculation method. However, even in the case of very regular exchanges, the methodology was never made fully transparent. On the one hand, operators considered that fully exposing their methodology carries the risk of revealing their technical innovations. This concern is crucial and legitimate. On the other hand, INSEE must respect its commitments in terms of professional ethics and standards. In particular, INSEE needs to measure the quality of the data it produces and to communicate their reliability to the public. The balance between these two requirements has yet to be found.

The transparency of the methodology: issues for MNOs

INSEE recommends that common methodological standards be adopted (definition of what a place of residence is, what commuting is, etc.). Nevertheless, in the MNOs’ view, sharing their methodology would raise at least two problems: it could reveal their innovations, and it would require them to adjust their statistical production line. The latter is costly given the computational burden in a Big Data ecosystem, but also because a new algorithm may require the approval of the Data Protection Authority.

4. Outcomes and Impacts

At the announcement of the lockdown, significant population movements were observed. These changes raised concerns about pressures on local health systems and therefore had to be quantified. In addition, a better understanding of who had changed residences (workers, second homeowners, and students) was also of interest to decision-makers and to the public. After the lockdown, the gradual return of population movements was also measured, and the return of population, especially nonresident workers, to the big cities was related to the rebound in economic activity in the country. INSEE published two press releases. The first one, published 3 weeks after the lockdown announcement, provided first experimental results on the population distribution before and during lockdown at the department level (NUTS3) (see INSEE Press Release of April 8, n.d.). It gave first measures of the population movements that happened: tourists, workers, and students returning to their homes, and city dwellers moving to more rural areas. It relied on data coming from one MNO only, and the release stated that the results would have to be confirmed by a later cross-MNO analysis. In addition to the press release, INSEE also communicated these experimental statistics to prefects as part of the management of the health crisis.

The second press release (INSEE Press Release of May 18, n.d.) consolidated the first results with data coming from two MNOs and covering a longer period (up to the end of April). An econometric approach was used to combine the MNOs’ indicators, with the final population adjustment made by INSEE. The second press release was accompanied by a detailed analysis report, which compared the signaling data-based population distribution over the territory before and after lockdown with descriptions, based on census and other official statistics data sources, of the territories most likely to host mobile groups (students, young adults, and owners of secondary residences) and most likely to show population changes when the lockdown was eased. The lockdown was gradually and partially lifted on May 11. The econometric approach adopted also made it possible to convey messages about the inherent uncertainty of daily signaling data-based indicators and to compare the population variations before and after lockdown with usual weekly changes.

The third communication was published in the INSEE collection dedicated to summaries of research studies conducted by the Institute for a wide audience (see Galiana et al., 2020). This study relied on data coming from three MNOs, and covered the whole period from prelockdown to the first phase of the easing of lockdown (up to the end of May). As for the second press release, it used an econometric approach to combine indicators coming from the three MNOs, and the population adjustment of the MNOs’ counts was performed by INSEE (even when the indicators had already been population-adjusted by the MNOs). It consolidated the previous findings and focused on what happened at the end of the lockdown period: weekly movements between urban centers during the week and more rural and coastal departments on weekends increased.
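The text does not detail the econometric model used to combine operators, so the following is only a simplified stand-in, not INSEE’s method. It assumes each operator reports a positive count per department, takes the geometric mean across operators as the combined estimate, and uses the dispersion of the log counts as a rough indicator of disagreement between sources. All names and numbers are hypothetical.

```python
import math
from statistics import mean, pstdev


def combine_operators(counts_by_operator):
    """Combine per-department counts reported by several operators.

    counts_by_operator: dict mapping an operator name to {department: count},
    where every count is strictly positive.
    Returns {department: (combined_count, log_spread)}.
    """
    # Keep only departments reported by every operator so averages are comparable.
    depts = set.intersection(*(set(c) for c in counts_by_operator.values()))
    combined = {}
    for dept in sorted(depts):
        logs = [math.log(c[dept]) for c in counts_by_operator.values()]
        # Geometric mean as point estimate, spread of logs as disagreement measure.
        combined[dept] = (math.exp(mean(logs)), pstdev(logs))
    return combined


# Hypothetical counts from two operators (illustrative numbers only).
example = {
    "op_a": {"75": 1000, "35": 400},
    "op_b": {"75": 1210, "35": 400},
}
result = combine_operators(example)
```

Departments where operators disagree strongly get a larger `log_spread`, which is one simple way to flag where a combined figure should be communicated with more caution. A final calibration to national totals would still be applied afterwards.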

In addition to population movements, INSEE used daily morning mobility indicators as proxies for commuting indicators to shed light on the pace of the economic recovery from the easing of the lockdown onward. This study relied on daily origin–destination matrices on a fairly fine geographical grid provided by only one MNO. This analysis of morning commuting was published in the Economic Outlook of June 17, 2020, along with other high-frequency data indicators (INSEE, 2020). This analysis relied on the same dataset as the one used in the INSERM–Orange lab study that characterizes mobility to inform an age-structured stochastic transmission model and evaluates the impact of the lockdown in curbing the COVID-19 epidemic (Pullano et al., 2020).
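As an illustration of how an origin–destination matrix can be turned into a commuting proxy, the sketch below computes the ratio of morning trips between distinct zones on a given day to the same quantity in a baseline period. The indicator definition, zone labels, and trip counts are assumptions made for the example and are not taken from the Economic Outlook.

```python
def inter_zone_trips(od_matrix):
    """Total number of trips whose origin and destination zones differ."""
    return sum(n for (origin, dest), n in od_matrix.items() if origin != dest)


def commuting_index(od_matrix, baseline_od_matrix):
    """Ratio of inter-zone morning trips to the same total in a baseline period."""
    baseline = inter_zone_trips(baseline_od_matrix)
    if baseline == 0:
        raise ValueError("baseline matrix contains no inter-zone trips")
    return inter_zone_trips(od_matrix) / baseline


# Hypothetical morning trip counts keyed by (origin zone, destination zone).
baseline_day = {("A", "B"): 800, ("B", "A"): 200, ("A", "A"): 500}
lockdown_day = {("A", "B"): 240, ("B", "A"): 60, ("A", "A"): 700}
index = commuting_index(lockdown_day, baseline_day)
```

With these illustrative figures, inter-zone trips fall from 1,000 to 300, so the index is 0.3. Tracking such a ratio day by day gives a high-frequency signal of how quickly commuting recovers.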

Aside from disseminating statistical results, INSEE also published a post on its blog explaining its strategy and practical approach regarding MPD (Sémécurbe et al., 2020). At a time when all kinds of MPD were perceived by the public as carrying a potential risk of tracing, which legitimately raises privacy concerns, it was necessary to explain what INSEE used (anonymous counting aggregates), what for, and how the current work specific to the health crisis fitted into INSEE’s long-term strategy and methodological work, often carried out in collaboration with MNOs and other European NSIs.

All publications were positively received, and widely covered by the media. This initiative undoubtedly confirmed the need for official statistics information on population present within a given place and time in addition to residential population, and on daily mobility.

Nevertheless, comparisons with INSEE reference/benchmark statistics also showed some shortcomings of the indicators based on network signaling data as produced by MNOs in meeting official statistics needs and standards, namely the inherent uncertainty and variability of these kinds of data, and the population-adjustment process. Combining sources from several MNOs allowed us to limit the impact of information collection problems. Measurement biases due to changes in user behavior could be corrected in the lockdown context through the final step of population adjustment performed by INSEE. However, this approach, which assumes that the overall population present on the territory is constant from day to day, was made possible by the very peculiar context of closed borders.

As a whole, the results showed consistent trends but detailed understanding of the phenomena at stake at a local level remained sometimes arduous, as reported, for instance, by the INSEE regional representatives.

5. Strengthen the Collaborations Between MNOs, Research Institutes, and NSIs to Make the Most of the Public Interest of MPD

The rapid response to the crisis was made possible by the fact that there was already substantial methodological expertise within the NSI on mobile data and their potential use for public interest. To prepare for future epidemics and more generally for public interest uses of these data, this investment must continue and be deepened. In fact, the production of robust and reliable statistics of public interest requires going further in the combination of sources, including the mobilization of raw data. As described above, commercial indicators are not always adapted to the NSI’s needs for official statistics (Cousin and Hillaireau, 2018), which has led Eurostat and various NSIs to launch initiatives to design an end-to-end statistical process (Dattilo et al., 2016; Ricciato et al., 2020; ESSnet Big Data II WPI, n.d.). Consequently, work to assess the quality of the information that can be extracted from mobile data and to establish reliable and transparent statistical processing methods that protect privacy must continue. In this sense, multipartner research initiatives as well as experimental work on real data should be encouraged. As an example, the MobiTic project (measuring people’s mobility and presence using information technologies and communication), funded by the French National Research Agency, brings together teams from Gustave Eiffel University, INSEE, CNRS, and Orange. MobiTic aims to produce a reliable, representative, and open-source method for estimating the population present in a given place at a given time, together with mobility statistics, by combining digital and traditional data (https://mobitic.huma-num.fr). Like some other initiatives, such as the OPAL project (OPen ALgorithms for better decisions, https://www.opalproject.org), it aims to prepare, test, and validate algorithms that could be integrated directly into the MNO Information System.
The technical issues of collecting, storing, and processing the massive data required for human mobility analyses have, in fact, long since been resolved.

6. Build Roles for Public Parties in Accordance with the Purposes They Are Mandated for

The great willingness of the parties to collaborate for the public interest in the context of the health crisis has made it possible to respond quickly and effectively. However, future partnerships could take even better advantage of the roles that the parties can play in accordance with the purposes they are mandated for. For instance, the principles under which an NSI can build partnerships with mobile operators are: (i) neutrality: an NSI is open to work with each and all mobile operators, (ii) protection of privacy and business secrets: an NSI will protect both, and (iii) transparency and quality control: an NSI operates on strict transparency principles, which require complete traceability of the data used to produce official statistics and access to the information needed to assess and ensure the quality of the products. The COVID-19 crisis gave INSEE the first opportunity to release multioperator statistics on the population present in a given place and time. This position must be consolidated. The NSI has the legitimacy to combine sensitive information coming from several operators, such as penetration rates, since it is used to protecting both business secrets (NSIs already receive very detailed financial information on firms through tax records, for instance) and privacy.

7. A Regulatory Framework Aligned to GDPR

While the GDPR has fully integrated the public interest objective of processing personal data (including geolocation data), the e-privacy regulation has not yet followed. EU telecom operators are subject to the e-privacy regulation and were therefore unable to make the billing data they permanently store available. Moreover, location data collected from electronic communication providers, such as MNOs, may only be processed within the remits of Articles 6 and 9 of the ePrivacy Directive. The national laws implementing the ePrivacy Directive specify that such data can only be used by the operator when they are made anonymous, or with the consent of the individuals. Unlike Article 89 of the GDPR (https://gdpr-info.eu/art-89-gdpr), this regulation does not provide for an exemption for scientific research or public statistics. The future regulation, currently under negotiation, should be aligned with the GDPR.

8. Investments in Privacy-Preserving Techniques

Whether for research or for official statistics, it seems necessary to be able to develop anonymized aggregates from raw data in collaboration with the operator. Developing high-quality indicators requires exploiting individual and longitudinal microdata over a long period of time for adjustment. The development of technical solutions such as multiparty computing, which by design preserves privacy, could be a way to limit the reidentification risk and, at the same time, to allay MNOs’ concerns about revealing sensitive business information.

Investing in privacy-preserving techniques: issues for MNOs

On top of confidentiality guarantees, secure multiparty computation in which different actors collaborate on producing a common output from private data could be realistically considered when leading to product improvement. Nevertheless, such new solutions would require financial investments that most operators consider out of reach given the fragility and uncertainty of the current market.
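To make the idea of secure multiparty computation concrete, here is a toy sketch based on additive secret sharing, one of the simplest building blocks of such protocols. It assumes three parties that follow the protocol honestly and only want to learn the sum of their private counts. The modulus, function names, and values are illustrative, and a real deployment would need authenticated channels and a vetted library rather than this sketch.

```python
import secrets

# Public modulus agreed on by all parties; it must exceed any possible total.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split value into n_parties random shares that sum to value modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    """Simulate a sum in which no party sees another party's private value."""
    n = len(private_values)
    # Each party splits its own value and sends one share to every party.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Party j only receives the j-th share from each party and publishes their sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Adding the published partial sums reveals the total and nothing else.
    return sum(partial_sums) % MODULUS


# Hypothetical private counts held by three operators for the same area.
total = secure_sum([120, 340, 95])
```

Each individual share is a uniformly random number, so a party that sees only some of the shares learns nothing about the underlying count. This is why such schemes can address both reidentification risk and concerns about revealing operator-specific figures such as local market shares.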

9. Jointly Promote the Social Acceptability of a Reasoned Use of MPD

A central issue has been the tension between the reputational risks facing an MNO and its willingness to participate in the fight against the pandemic. Even when only statistical indicators were actually used, the risk of possible suspicion of individual tracing was critical for the operator’s decision-makers. In fact, bad press and the cost in terms of brand image seem to be a considerable risk to consider when using customer data, even when transformed into anonymous aggregates. However, the open collaborations with official statistics described here serve as an example and make it possible to clarify the processing conditions in the interest of privacy while emphasizing the public interest of the information finally released, as did the INSEE blog post that accompanied the press releases (Sémécurbe et al., 2020).

10. Articulating Commercial and Public Interest Rationales

The simultaneous treatment of MPD for commercial purposes and for public statistical uses creates obvious tensions. Both aims are legitimate but follow different modus operandi and can conflict with each other if not articulated. The circumstances of the COVID-19 crisis were so exceptional that the issue was easily overcome, on an ad hoc basis, during the apex of the crisis. But, defining ex-ante the principle of such articulation in more normal circumstances appears to be critical to the sustainability of any public–private partnership. The commercial potential of mobile data lies in addressing the specific requirement of public or private customers through custom-made treatment (i.e., specific geographical and temporal scope), whereas NSIs are producing reference statistics which are more general and do not aim to address specific needs. The potential overlap appears minimal, unless the market remains a niche market. In addition, one of the constraints to grow such a market lies in the difficulty for the potential clients to assess the accuracy and value of the statistics produced. The simultaneous exploitation of mobile data by operators and NSI can be an opportunity for the NSI to contribute to the qualification and improvement of the data extraction process, thus contributing to the development of the market. It relies on the capacity of the NSI to access the needed information on raw data treatment by mobile operators while legitimately preserving business secrets.

11. Articulating Transparency and Business Secrets Requirements

The business aims pursued by mobile operators imply protecting the intellectual property deriving from the investment made to produce the data and develop the associated data treatments. The methodological investments constitute legitimate business secrets, which can be protected by law. In addition, when the raw data underlying the statistics are particularly sensitive on privacy grounds, the requirement to process the data transparently is even more acute. Similarly, as partnerships are developed, mobile operators should further document the treatments applied to produce the commercial statistics in order to improve their commercial value, while not revealing the core intellectual property. This documentation would typically not include the release of the detailed treatments and algorithms’ source code to third parties. The NSIs, which are routinely subject to the requirement of documenting their statistics, can draw on their experience to assist in this process and ensure that the information released is sufficient to assess the relevance and accuracy of the statistics produced. This potential contribution of NSIs appears as a natural synergy of an NSI/mobile operator partnership, as NSIs are required to understand and assess the relevance and quality of the intermediate data used to produce public statistics.

12. Toward a Sound Governance

Even though all parties were willing to make exceptional efforts to contribute to addressing the challenges arising from the COVID-19 crisis, the need to ensure a strict compliance with relevant regulation and to clarify respective responsibilities was a real constraint. The principles under which an NSI can build partnerships with mobile operators should be reaffirmed. To enlarge the scope of future fruitful collaborations, governance schemes between mobile operators, NSIs, and public bodies should be further strengthened on an ex ante basis, as developing governance schemes in crisis periods is more challenging.

A future partnership between MNOs and INSEE? A conditional Yes!

The shared exploitation of mobile data by the individual MNOs and INSEE could be a real opportunity to improve the MNOs’ statistical production, to multiply the users of this underused dataset, and ultimately to contribute to the positive evolution of this market. However, a win–win partnership is reachable only if public indicators seldom interfere with the operators’ market, and as long as MNOs can rely on financial incentives to put these operations in place.

Acknowledgements

The authors deeply thank Bouygues Telecom, Orange Business Services, and SFR data analytics units for having made the COVID-19 collaboration possible, as well as for their fruitful exchanges and discussions all along the COVID-19 collaboration, and hope this will continue. They are grateful to Pascal Chambreuil (OBS), Marc Jossermoz, and Loïc Lelievre (SFR Géostatistics) for fruitful discussions on building long-term partnerships between INSEE and mobile network operators. They also greatly thank Benoit Loutrel, Stefania Rubrichi, and Zbigniew Smoreda for helpful comments and discussions on earlier versions of the paper. This paper reflects the authors’ opinions.

Funding Statement

This work received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing Interests

The authors declare no competing interests exist.

Data Availability Statement

The data concerning the exchanges between INSEE and MNOs following up on a specifically designed questionnaire are available from the authors with the permission of the MNOs.

Author Contributions

Writing-review & editing, E.C., M.P., and M.S.C. The authors contributed equally to the article.

Footnotes

1 Boxes are presented here to highlight INSEE’s understanding of some of the issues faced by MNOs when building partnerships with NSIs, especially with a view to producing common statistical indicators from MPD. This understanding has emerged from exchanges between INSEE and MNOs following up on a specifically designed questionnaire.

References

Batista E Silva, F., Carneiro Freire, S.M., Schiavina, M, Rosina, K., Marín Herrera, M.A., Ziemba, L.W., Craglia, M., Koomen, E. and Lavalle, C., (2020) Uncovering temporal changes in Europe’s population density patterns using a data fusion approach. Nature Communications 11, 4631. https://doi.org/10.1038/s41467-020-18344-5CrossRefGoogle ScholarPubMed
Blondel, VD, Decuyper, A and Krings, G (2015) A survey of results on mobile phone datasets analysis. EPJ Data Science 4, 10. https://doi.org/10.1140/epjds/s13688-015-0046-0CrossRefGoogle Scholar
Cousin, G and Hillaireau, F (2018) Can mobile phone data improve the measurement of international tourism in France?. Economie et Statistique 505(1), 89107. Available at https://www.insee.fr/fr/statistiques/3706178?sommaire=3706255Google Scholar
Dattilo, B, Radini, R and Sabato, M (2016). How many SIM in your luggage? A strategy to make mobile phone data usable in tourism statistics. In Istat, 14th Global Forum on Tourism Statistics, Italian National Institute of Statistics, Venice, Italy (pp. 1–16).Google Scholar
Debusschere, M, Sonck, J and Skaliotis, M (2016) Official statistics and mobile network operator partner up in Belgium, The OECD Statistics Newsletter, No. 65, pp. 11–14.Google Scholar
Legifrance (2020) Décret n°2020-260 du 16 mars 2020 portant réglementation des déplacements dans le cadre de la lutte contre la propagation du virus covid-19, Legifrance. Available at www.legifrance.gouv.fr/affichTexte.do?cidTexte=JORFTEXT000041728476&categorieLien=id (accessed 24 March 2020).Google Scholar
EUR-Lex (n.d.) Directive 2002/58/EC of the European Parliament and of the Council of 12 July 2002 concerning the processing of personal data and the protection of privacy in the electronic communications sector (Directive on privacy and electronic communications). Available at https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=CELEX%3A32002L0058 (accessed 31 July 2002).Google Scholar
Galiana, LS, Castillo, M, Sémécurbe, F, Coudin, E and de Bellefon, MP (2020) Retour partiel des mouvements de population avec le déconfinement, Insee Analyses N°54, INSEE. Available at https://www.insee.fr/fr/statistiques/4635407Google Scholar
INSEE Press Release of April 8 (n.d.) Population présente sur le territoire avant et après le début du confinement: résultats provisoires. Available at https://www.insee.fr/fr/information/4477356 (accessed 4 August 2020).Google Scholar
INSEE Press Release of May 18 (n.d.) Population présente sur le territoire avant et après le début du confinement: résultats consolidés. Available at https://www.insee.fr/fr/information/4493611 (accessed 18 May 2020).Google Scholar
INSEE (2020) By the end of May, morning commutes had only reached 60% of their usual level. In INSEE Economic Outlook 2020. Available at https://www.insee.fr/en/statistiques/4620435?sommaire=4473307 (17 June 2020).Google Scholar
National Statistical Law Loi n°51-711 du 7 juin 1951 sur l’obligation, la coordination et le secret en matière de statistiques. Available at https://www.legifrance.gouv.fr/loda/id/JORFTEXT000000888573/2013-07-26 (accessed 25 March 2019).Google Scholar
Pullano, G, Valdano, E, Scarpa, N, Rubrichi, S and Colizza, V (2020) Evaluating the effect of demographic factors, socioeconomic factors, and risk aversion on mobility during the COVID-19 epidemic in France under lockdown: a population-based study, The Lancet Digital Health, volume 2, issue 12. https://doi.org/10.1016/S2589-7500(20)30243-0CrossRefGoogle ScholarPubMed
Regulation (EC) No 223/2009 of the European Parliament and of the Council. Available at https://eur-lex.europa.eu/legal-content/fr/ALL/?uri=CELEX%3A32009R0223 (31 Marc 2009).Google Scholar
Ricciato, F, Widhalm, P, Pantisano, F and Craglia, F (2017) Beyond the “single-operator, CDR-only” paradigm: an interoperable framework for mobile phone network data analyses and population density estimation. Pervasive and Mobile Computing 35, 6582.CrossRefGoogle Scholar
Ricciato, F, Lanzieri, G, Wirthmann, A and Seynaeve, G (2020) Towards a methodological framework for estimating present population density from mobile network operator data. Pervasive and Mobile Computing 68, 101263.CrossRefGoogle Scholar
Salje, H, Tran, Kiem C, Lefrancq, N, Courtejoie, N, Bosetti, P, Paireau, J, Andronico, A, Hozé, N, Richet, J, Dubost, CL, Le Strat, Y, Lessler, J, Levy-Bruhl, D, Fontanet, A, Opatowski, L, Boelle, PY, Cauchemez, S, (2020) Estimating the burden of SARS-CoV-2 in France. Science 369(6500), 208211. https://doi.org/10.1126/science.abc3517CrossRefGoogle ScholarPubMed
Sakarovitch, B, de Bellefon, M-P, Givord, P and Vanhoof, M (2018) Estimating the residential population from mobile phone data, an initial exploration. Economie et Statistique/Economics and Statistics 505–506, 109132. https://doi.org/10.24187/ecostat.2018.505d.1968Google Scholar
Sémécurbe, F, Suarez Castillo, M, Galiana, L, Coudin, E and Poulhes, M (2020) Que peut faire l’Insee à partir de données de téléphonie mobile? Mesure de population présente en temps de confinement et statistiques expérimentales, INSEE Blog Post. Available at https://blog.insee.fr/que-peut-faire-linsee-a-partir-des-donnees-de-telephonie-mobile-mesure-de-population-presente-en-temps-de-confinement-et-statistiques-experimentales (accessed 15 April 2020).Google Scholar
Tavernier, JL (2020) Official statistics and the challenges of the current health crisis, INSEE Blog Post. Available at https://blog.insee.fr/official-statistics-and-the-challenge-of-the-current-health-crisis (accessed 14 May 2020).Google Scholar
Vanhoof, M, Combes, S and de Bellefon, M-P (2017) Mining mobile phone data to detect urban areas. In Petrucci, A and Verdec, R (eds), SIS 2017 Statistics and Data Science: New Challenges, New Generations. Proceedings of the Conference of the Italian Statistical Society. Florence: Firenze University Press, pp. 10051012.Google Scholar
Vanhoof, M, Reis, F, Ploetz, T and Smoreda, Z (2018) Assessing the quality of home detection from mobile phone data for official statistics. Journal of Official Statistics 34(4), 935960. https://doi.org/10.2478/jos-2018-0046CrossRefGoogle Scholar

Author comment: The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook — R0/PR1

Comments

September 25, 2020

Data&Policy

Editorial Board

Dear Editors,

Please find enclosed the manuscript entitled

The French official statistics strategy: combining signaling data from various MNOs for documenting COVID-19 crisis effect on population movements and economic outlook

written with Mathilde Poulhes (INSEE) and Milena Suarez Castillo (INSEE), which we submit for publication in the Special Collection on Telco Big Data Analytics for COVID-19 of Data&Policy.

Kind regards,

Elise Coudin, PhD

Head of SSP Lab

INSEE

Review: The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook — R0/PR2

Conflict of interest statement

I'm a member of the ESS Task Force on Big Data/Trusted Smart Statistics and the coordinator and a researcher of the work package on mobile network data of the European project ESSnet on Big Data II, comprising several European statistical offices, in particular also the French statistical office (INSEE). I share research with the authors for the European Statistical System.

Comments

Comments to Author: This manuscript describes the collaboration between the French National Statistical Institute (INSEE) and French mobile network operators (MNOs hereafter) during the COVID-19 crisis to make use of mobile network data to provide insights about the mobility of the French population. The reuse of mobile network data, and of new digital data more generally, is a central issue both in data collaborative initiatives in the international community and in the production of official statistics.

The article reviews and summarises the main points of these collaborations, identifying key issues and proposing further joint research on statistical methodology as the baseline for deciding on a course of action for the use of this data source in the production of official statistics.

The identified key strategic issues for a future sustainable partnership are:

- “Transmission of non-adjusted data to National Statistics Institutes (NSIs)”

The use of commercial final statistical products elaborated by MNOs’ data analysts poses a problem of quality for official statistical production in terms of compatibility of concepts, measures, and methods. The knowledge of some form of intermediate data by NSIs is perceived by MNOs as too high a risk because of perceived potential leakage of sensitive information to competitors (e.g. local market shares).

In my opinion, this is an extremely important finding arising from this experience.

- “Transparency of methodology”

Harmonised structural metadata and open standardized statistical methodology are recommended to produce official statistics based on this data source. However, industrial secrecy and intellectual property rights regarding innovation are requested by MNOs to keep a competitive advantage in the market. Moreover, in such an internationally standardized scenario, adjustments to their production lines would be necessary, which would require further investments.

In my view, the recognition of legitimate interests of both MNOs and NSIs is a first condition to further agree on a course of action.

- “Collaborations between MNOs, research institutes, and NSIs”

The technical, statistical, and business complexity of this data source requires such a joint collaboration in the international community in order “to take the most of the public interest of mobile phone data”.

The international dimension of this new data source is clearly stated and, in my opinion, is critical for the public sector and National Statistical Systems in particular.

- “Roles for public parties”

Although the COVID-19 crisis has allowed INSEE to release a multi-MNO statistic on present population and human mobility, the authors propose building a sustainable partnership by clearly defining the roles of the statistical offices (such as “neutrality”, “protection of privacy and business secrets”, and “transparency and quality control”). Public parties are to act according to their mandates, avoiding market disruption and the leakage or transmission of sensitive information. They have a legal basis for combining sensitive information (as in their traditional statistical production processes).

My understanding is that this element is deeply entangled with the role of statistical offices in the new emerging scenario with the increasing data deluge. I agree that the traditional role of NSIs to collect and combine even personal data (e.g. survey data) must be clearly stated and underlined. This in no way means that privacy and confidentiality are violated at any moment.

- “Regulatory framework”

The legal framework needs further adaptation to completely align the protection of privacy and confidentiality of citizens, the social need to produce official statistics, and the legitimate business and commercial interests of private companies.

In my opinion, this is another highly important piece of information arising from this experience.

- “Investing in privacy-preserving techniques”

New techniques such as secure multiparty computation need to be jointly researched and incorporated into the production of official statistics based on mobile network data.

In my view, this is intimately connected with the need for collaborations. The inclusion of these techniques to process information in a secured way is rooted in the general social interest and consequently must be jointly undertaken.

- “Social acceptability”

A central issue is the communication policy regarding the use of mobile network data covering all facets and underlining transparency, openness, and privacy preservation.

I fully agree with this point. The communication policy is a central issue by which the citizen must be clearly informed about what data are processed (the telco data ecosystem is very complex and only some data are indeed needed), how they are processed (open methodology), and who processes them (roles and responsibilities of each actor).

- “Articulating transparency and business secrets requirements”

Statistical offices are suggested to play a strategic role in conjugating both the required transparency of the treatment of this sensitive data source and the protection of intellectual property underlying the private investment by MNOs.

In my view, this is a strategic proposal and approach aiming at providing constructive solutions. Statistical offices are suggested to adopt new roles in which they simultaneously protect data privacy and confidentiality of citizens (as with traditional data sources) and foster the digital data and statistical markets, assisting in the open but protected documentation of underlying techniques and algorithms.

- “Sound governance”

Governance schemes delimiting roles and responsibilities in potential partnerships should be built for a sustainable production.

In my opinion, this is again entangled with the role of statistical offices in the new emerging data and statistical markets. I fully agree that roles and responsibilities should be clearly delimited, especially in connection with the figure of data stewards in data collaborative initiatives.

As the main conclusion, INSEE sees a real opportunity to incorporate this data source into the production of official statistics, “to improve the MNO statistical production, to multiply the users of this […] data and ultimately to contribute to the positive evolution of this market”. The authors state a need to combine public and private interests and to find financial incentives for MNOs.

Each of the preceding key issues invites deeper reflection, in connection e.g. with international (European) and national data strategies, the figure of data stewards, or the economic costs and investments needed to make use of this data source for statistical purposes (just briefly mentioned in the authors’ final statement). All of these could also be briefly mentioned in the article and will certainly need to be taken into account in the construction of partnerships, but this does not decrease the value of this important contribution. INSEE provides a fairly complete set of key insights, grounded in empirical evidence, about the incorporation of mobile network data in the production of official statistics.

The work is suitable for publication, technically correct, and scientifically and strategically sound. Thus, it is my recommendation that it should be published as part of the Special Collection on Telco Big Data Analytics for COVID-19 in the journal Data & Policy.

Review: The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook — R0/PR3

Conflict of interest statement

No Conflicts of Interest.

Comments

Comments to Author: What I miss in the argumentation is the following: Traditional business statistics also require very sensitive data from enterprises: wages, turnover, balance sheet data, etc. So far, this has not been without a certain resistance and scepticism. Nevertheless, the trust in the confidentiality of these data in the statistics institute and the corresponding (legal) obligation to respond have made these data deliveries possible (even if not very much liked). Where is the difference? Under an approach to governance based on voluntary ('philanthropic') contributions, the entire existing relationship between governmental tasks to provide economic and socio-political information and the obligations of companies to participate in it is called into question. Should they possibly only pay taxes in the future if this results in a win-win situation for them? In my opinion, it must be justified why the state (here in the form of the statistical authority) should engage in such a private-sector logic and not apply the one that is typically used in traditional business statistics. Incidentally, 'philanthropy' is an approach that is mainly based on, and to be understood as part of, US American culture, and is relatively new to more state-centric European cultures such as the French one. In this sense, I would be hesitant to apply this approach alongside the typical European form of interpreting institutions, where official statistics represent the state with its authority and its obligations vis-à-vis its citizens.

Recommendation: The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook — R0/PR4

Comments

Comments to Author: A good paper, giving the perspective of an NSO on the usefulness of telco big data for mobility data in the context of COVID-19. Good learnings and identification of challenges.

Decision: The French official statistics strategy: Combining signaling data from various mobile network operators for documenting COVID-19 crisis effects on population movements and economic outlook — R0/PR5

Comments

No accompanying comment.