
Earth data

Global Sunrise & Sunset Data:

Description: Global, pixel-level daily data on sunrise and sunset times from 1700 to the present.

Python code: the Python code to scrape the data; modify it to cover any location and any period.

Stata code: the Stata code to clean the scraped data above.

Data Source: U.S. Naval Observatory

Paper: Nutrition, Labor Supply, and Productivity: Evidence from Ramadan in Indonesia
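Covering "any location and any period" comes down to enumerating one query per location-day before any request is sent. The sketch below shows only that enumeration step; the function names, the query keys (`date`, `lat`, `lon`), and the example coordinates and dates are illustrative assumptions, not the names used by the scraper or by the data source.

```python
from datetime import date, timedelta
from itertools import product


def daily_dates(start, end):
    """Yield every calendar date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def build_requests(locations, start, end):
    """Return one query dict per (location, day) pair.

    The keys are hypothetical; rename them to match the service being queried.
    """
    return [
        {"date": day.isoformat(), "lat": lat, "lon": lon}
        for (lat, lon), day in product(locations, daily_dates(start, end))
    ]


# Example: two arbitrary pixels over three days gives 2 x 3 = 6 queries.
requests_to_send = build_requests(
    locations=[(-6.2, 106.8), (-7.8, 110.4)],
    start=date(2000, 1, 1),
    end=date(2000, 1, 3),
)
print(len(requests_to_send))
print(requests_to_send[0])
```

Building the full list first makes it easy to split the work into batches or to resume after an interrupted run.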

 

Global Temperature Data:

 

Description: Global, pixel-level daily data on average, minimum, and maximum temperature from 1979 to the present.

Python code: the Python code to convert the original NetCDF data to CSV.

Stata code: the Stata code to merge and clean the CSV data above.

Data Source: Climate Prediction Center Global Temperature Time Series
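The NetCDF-to-CSV step is essentially a reshape: a grid indexed by (time, latitude, longitude) becomes one row per pixel-day. The sketch below illustrates only that reshape with the standard library, using nested lists as a stand-in for arrays that a NetCDF reader would supply. The column names, the use of `None` for missing cells, and the toy values are all assumptions for illustration.

```python
import csv
import io


def grid_to_rows(dates, lats, lons, values):
    """Flatten values[t][i][j] into (date, lat, lon, value) rows.

    Cells equal to None are treated as missing and skipped (an assumption
    about how missing pixels are represented after reading the file).
    """
    for t, day in enumerate(dates):
        for i, lat in enumerate(lats):
            for j, lon in enumerate(lons):
                value = values[t][i][j]
                if value is not None:
                    yield (day, lat, lon, value)


def write_csv(rows, handle, header=("date", "lat", "lon", "tavg")):
    """Write the header and all rows to handle; return the number of data rows."""
    writer = csv.writer(handle)
    writer.writerow(header)
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Toy grid: 2 days x 2 latitudes x 2 longitudes, with one missing cell.
dates = ["1979-01-01", "1979-01-02"]
lats = [0.25, 0.75]
lons = [100.25, 100.75]
values = [
    [[26.1, 26.4], [25.9, None]],
    [[26.3, 26.6], [26.0, 25.7]],
]

buffer = io.StringIO()
n_rows = write_csv(grid_to_rows(dates, lats, lons, values), buffer)
csv_text = buffer.getvalue()
print(n_rows)
print(csv_text)
```

A long format like this, with one row per pixel-day, is straightforward to merge on date and coordinates in a later cleaning step.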

 

 

Search Index data

Google Search Index Data:

Description: Search index data for selected keywords in all countries, at country to city level and at monthly to minutely frequency, from 2004 to the latest available date.

Python code: the Python code to scrape the search index of any keyword, in any region, for any period from 2004 onward.

Data Source: Google Trends

Note: The search index data can be used to predict the latest trends or as outcome variables that measure intention. Daily, hourly, and minutely data are available only for the past 90 days, past 7 days, and past 4 hours, respectively.
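The look-back limits in the note can be encoded as a small lookup that returns the finest frequency available for a given window. This is a sketch under stated assumptions: the function and constant names are hypothetical, and the fallback to "monthly" for older windows simply reuses the coarsest level named in the description rather than modeling any intermediate frequency.

```python
from datetime import datetime, timedelta

# Look-back limits taken from the note above, finest frequency first.
RESOLUTION_LIMITS = [
    ("minutely", timedelta(hours=4)),
    ("hourly", timedelta(days=7)),
    ("daily", timedelta(days=90)),
]


def finest_resolution(window_start, now):
    """Return the finest frequency available for a window beginning at window_start."""
    if window_start > now:
        raise ValueError("window_start must not be in the future")
    age = now - window_start
    for name, limit in RESOLUTION_LIMITS:
        if age <= limit:
            return name
    # Assumption: anything older falls back to the coarsest level, monthly.
    return "monthly"


now = datetime(2024, 1, 31, 12, 0)
print(finest_resolution(now - timedelta(hours=3), now))   # minutely
print(finest_resolution(now - timedelta(days=2), now))    # hourly
print(finest_resolution(now - timedelta(days=30), now))   # daily
print(finest_resolution(datetime(2004, 1, 1), now))       # monthly
```

Checking the window before sending a request avoids asking for a frequency that the chosen period cannot provide.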

Scraping Data from Other Websites

Book/Movie Ratings:

Description: Data from Douban, a leading book and movie review website in China with 200 million users.

Python code: the Python code to scrape book information such as title, author, and rating. The code can also be used to scrape other information from Douban pages with a similar structure.

Data Source: Douban

Wikipedia:

Description: Encyclopedic information about any topic from Wikipedia.

Python code: e.g., the Python code to scrape the country code and official languages of any country from Wikipedia.

Data Source: Wikipedia
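Fields such as a country code or official language typically sit in label/value table rows, so the extraction step can be sketched as a small HTML parser. The example below uses only the standard library and runs on a hypothetical HTML snippet rather than a live page. The class name, the assumed row layout (`<tr><th>label</th><td>value</td></tr>`), and the field labels are illustrative assumptions, and real pages may need extra handling for nested tables or footnotes.

```python
from html.parser import HTMLParser


class InfoboxParser(HTMLParser):
    """Collect {header text: cell text} from <tr><th>..</th><td>..</td></tr> rows."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._cell = None   # "th" or "td" while inside that cell, else None
        self._label = []    # text fragments of the current row's header
        self._value = []    # text fragments of the current row's data cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._label, self._value = [], []
        elif tag in ("th", "td"):
            self._cell = tag

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            self._cell = None
        elif tag == "tr":
            # Join fragments and collapse runs of whitespace.
            label = " ".join("".join(self._label).split())
            value = " ".join("".join(self._value).split())
            if label and value:
                self.fields[label] = value

    def handle_data(self, data):
        if self._cell == "th":
            self._label.append(data)
        elif self._cell == "td":
            self._value.append(data)


# Hypothetical snippet standing in for a downloaded country page.
sample_html = """
<table class="infobox">
  <tr><th>Official language</th><td><a href="#">Indonesian</a></td></tr>
  <tr><th>ISO 3166 code</th><td>ID</td></tr>
  <tr><th colspan="2">Government</th></tr>
</table>
"""

parser = InfoboxParser()
parser.feed(sample_html)
print(parser.fields)
```

Rows that have a header but no value cell, like the section header in the sample, are skipped, so only label/value pairs end up in `fields`.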

Online Product Data:

Description: Information about products sold on an online shopping website.

Python code: e.g., the Python code to scrape product name, brand, and shipping fees. It can also be adapted to gather other key information, such as price and rating.

Data Source: Newegg

 

 