Sample Data Description of mPat

This 400 MB dataset contains four kinds of data from Shenzhen, China: CDR (call detail record) data, smartcard data, taxicab GPS data, and bus GPS data, covering 6 AM on 2013-10-22 to 12 AM on 2013-10-23. The dataset is for academic research only. All rights reserved. For privacy, all identifiable IDs in each kind of data have been replaced by serial numbers.

(1) CDR Data Format

0055556100,2013-10-22 08:27:50,114.121305,22.57902

SIM Card ID, Time, Longitude, Latitude
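A minimal sketch of reading one CDR record in Python. It assumes the fields are comma-separated with no header row, that the timestamp follows the `YYYY-MM-DD HH:MM:SS` pattern of the sample, and that the third value is the longitude and the fourth the latitude (the sample's 114.121305 cannot be a latitude). The names `CdrRecord` and `parse_cdr_line` are hypothetical and not part of the dataset.

```python
from datetime import datetime
from typing import NamedTuple


class CdrRecord(NamedTuple):
    sim_id: str        # kept as text so leading zeros survive
    time: datetime
    longitude: float   # assumed to be the third field
    latitude: float    # assumed to be the fourth field


def parse_cdr_line(line: str) -> CdrRecord:
    """Parse one comma-separated CDR record."""
    sim_id, time_text, lon_text, lat_text = line.strip().split(",")
    return CdrRecord(
        sim_id=sim_id,
        time=datetime.strptime(time_text, "%Y-%m-%d %H:%M:%S"),
        longitude=float(lon_text),
        latitude=float(lat_text),
    )


# The sample record from this description.
record = parse_cdr_line("0055556100,2013-10-22 08:27:50,114.121305,22.57902")
print(record)
```

Keeping the SIM card ID as a string rather than an integer preserves the leading zeros of the serial number.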

(2) Smartcard Data Format

000000064,2013-10-22 09:49:33,31,26路

Smartcard ID, Time, Transaction Type (21, 22, or 31), Metro Station or Bus Line

Transaction Type:

  • 31-Bus Boarding
  • 21-Subway Swiped-In
  • 22-Subway Swiped-Out
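The transaction codes above can be mapped to readable labels while parsing. The following is only a sketch: the function name `parse_smartcard_line`, the dictionary keys, and the `"unknown"` fallback for unlisted codes are illustrative choices, and the comma-separated layout is inferred from the sample record.

```python
from datetime import datetime

# Transaction codes as listed in this description.
TRANSACTION_TYPES = {
    "31": "bus boarding",
    "21": "subway swipe-in",
    "22": "subway swipe-out",
}


def parse_smartcard_line(line):
    """Parse one smartcard record into a dictionary."""
    # Split on the first three commas only, so a station or line name
    # that happens to contain a comma stays in one piece (an assumption).
    card_id, time_text, type_code, place = line.strip().split(",", 3)
    return {
        "card_id": card_id,
        "time": datetime.strptime(time_text, "%Y-%m-%d %H:%M:%S"),
        "type_code": type_code,
        "transaction": TRANSACTION_TYPES.get(type_code, "unknown"),
        "place": place,  # metro station or bus line, as written in the data
    }


# The sample record from this description.
parsed = parse_smartcard_line("000000064,2013-10-22 09:49:33,31,26路")
print(parsed["transaction"], parsed["place"])
```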

(3) Taxi GPS Data Format

22223,2013-10-22 08:49:25,114.116631,22.582466,0

Taxi ID, Time, Longitude, Latitude, Occupancy Status

Occupancy Status:

  • 1-with passengers
  • 0-without passengers
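One common use of the occupancy flag is to find pickups, that is, moments when a taxi's status changes from 0 to 1. The sketch below assumes that 1 means occupied and 0 means vacant, and that each taxi's records arrive in time order. The function name `count_pickups` is hypothetical, and only the first sample row comes from this description; the other rows are made up for illustration.

```python
def count_pickups(lines):
    """Count 0 -> 1 occupancy transitions per taxi.

    Assumes the records of each taxi are already sorted by time.
    """
    last_occupied = {}  # taxi ID -> occupancy seen in the previous record
    pickups = {}        # taxi ID -> number of pickups found so far
    for line in lines:
        taxi_id, _time, _lon, _lat, status = line.strip().split(",")
        occupied = status == "1"
        # A pickup is a vacant record followed by an occupied one.
        if occupied and last_occupied.get(taxi_id) is False:
            pickups[taxi_id] = pickups.get(taxi_id, 0) + 1
        last_occupied[taxi_id] = occupied
    return pickups


sample = [
    "22223,2013-10-22 08:49:25,114.116631,22.582466,0",  # from this description
    "22223,2013-10-22 08:49:55,114.117000,22.582500,1",  # made-up row
    "22223,2013-10-22 08:50:25,114.118000,22.582600,1",  # made-up row
    "22223,2013-10-22 08:55:25,114.120000,22.583000,0",  # made-up row
]
pickup_counts = count_pickups(sample)
print(pickup_counts)
```

A taxi whose first record is already occupied is not counted, because the start of that trip lies outside the data.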

(4) Bus GPS Data Format

55640,2013-10-22 09:13:16,03950,113.890800,22.580256

Bus ID, Time, Bus Line, Longitude, Latitude
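Several records can be read at once with Python's standard `csv` module. This sketch assumes there is no header row and that the two coordinate values are longitude followed by latitude, as the sample values suggest; the field names in `BUS_FIELDS` and the function `read_bus_records` are illustrative and not defined by the dataset.

```python
import csv
import io

# Column names assumed from the field list in this description.
BUS_FIELDS = ["bus_id", "time", "bus_line", "longitude", "latitude"]


def read_bus_records(text):
    """Read comma-separated bus GPS records from a block of text."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=BUS_FIELDS)
    records = []
    for row in reader:
        # Convert only the coordinates; IDs stay as text so that a
        # bus line such as "03950" keeps its leading zero.
        row["longitude"] = float(row["longitude"])
        row["latitude"] = float(row["latitude"])
        records.append(row)
    return records


# The sample record from this description.
bus_records = read_bus_records("55640,2013-10-22 09:13:16,03950,113.890800,22.580256\n")
print(bus_records[0]["bus_line"], bus_records[0]["longitude"])
```

To read from a file instead, an open file object can be passed to `csv.DictReader` in place of the `io.StringIO` wrapper.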


We are negotiating with service providers in Shenzhen to publish more data safely. Because of security and privacy concerns, some of the data used in the paper cannot currently be made public. Please follow our future work, and please cite the following paper when using this dataset.

Desheng Zhang, Jun Huang, Ye Li, Fan Zhang, Chengzhong Xu, and Tian He. Exploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales. In MobiCom '14, 2014.

Desheng Zhang (

Tian He (tianhe@cs.umn.edu)

Fan Zhang (