## TL;DR
# Install/update the Python 3 modules (tested with Python 3.6 / Ubuntu)
pip3 install -U SPARQLWrapper
pip3 install -U fiona
pip3 install -U csvtomd
pip3 install -U requests
# Run from the project root (expect 30-40 minutes)
# Be careful: this runs 'make all'
./run_all.sh
# Check the log file
cat x_tempshape/run_all.log
# Check the summary
cat x_tempshape/update.md
# Check the individual changes (every table) in temp_shape/*
# Check the logs and the shape files
# If everything is OK, move the shape files to the correct folder
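Because the full run takes a while, it can help to confirm first that the modules installed above are visible to Python. The following is only a sketch: it assumes each module is importable under the same name as its pip package (`csvtomd` in particular is an assumption), and `missing_modules` is a hypothetical helper, not part of the project.

```python
import importlib.util

# Names taken from the pip3 commands above; assumed to match the import names.
REQUIRED = ["SPARQLWrapper", "fiona", "csvtomd", "requests"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("all modules found")
```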
## Run an individual process (updating "10m_physical/ne_10m_lakes_north_america.shp")
# usage: ./tools/wikidata/update.sh <mode> <LetterCase> <shape_path> <shape_filename>
./tools/wikidata/update.sh all lowercase 10m_physical ne_10m_lakes_north_america
mode =
- fetch = fetch Wikidata labels (names) via SPARQL and create a csv file
- write = create a new temp shape file with the new Wikidata names
- fetch_write = fetch and write
- copy = copy the temp shape file + audit files to the original place
- all = fetch + write + copy
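The mode list can be read as a small lookup table: the combined modes are shorthand for a sequence of basic steps. The sketch below only illustrates that reading. `MODE_STEPS`, `expand_mode` and `step_commands` are hypothetical names, and nothing is executed; the command lines are just printed.

```python
# Mode names and their meaning are taken from the list above.
MODE_STEPS = {
    "fetch": ["fetch"],
    "write": ["write"],
    "fetch_write": ["fetch", "write"],
    "copy": ["copy"],
    "all": ["fetch", "write", "copy"],
}


def expand_mode(mode):
    """Return the basic steps that a mode stands for."""
    try:
        return MODE_STEPS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


def step_commands(mode, lettercase, shape_path, shape_name):
    """Build one update.sh command line per basic step (nothing is run)."""
    script = "./tools/wikidata/update.sh"
    return [
        " ".join([script, step, lettercase, shape_path, shape_name])
        for step in expand_mode(mode)
    ]


for line in step_commands("all", "lowercase", "10m_physical",
                          "ne_10m_lakes_north_america"):
    print(line)
```

Running the three printed commands one after another should be equivalent to a single call with mode `all`, which is what the step-by-step workflow relies on.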
## Step by step
# usage: ./tools/wikidata/update.sh <mode> <LetterCase> <shape_path> <shape_filename>
# fetch Wikidata labels (names) via SPARQL and create a csv file
./tools/wikidata/update.sh fetch lowercase 10m_physical ne_10m_lakes_north_america
# create a new temp shape file with the new Wikidata names
./tools/wikidata/update.sh write lowercase 10m_physical ne_10m_lakes_north_america
# copy the temp shape file + audit files to the original place
./tools/wikidata/update.sh copy lowercase 10m_physical ne_10m_lakes_north_america
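The fetch step asks Wikidata for labels via SPARQL. The query the tool actually sends is not shown in this README, so the following is only a rough sketch of what a label query could look like. `build_label_query` and `fetch_labels` are hypothetical helpers; the endpoint is the public Wikidata Query Service, where the `wd:` and `rdfs:` prefixes are predefined.

```python
import re

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def build_label_query(qids, languages):
    """Build a SPARQL query returning one row per (item, language) label."""
    for qid in qids:
        if not QID_PATTERN.fullmatch(qid):
            raise ValueError(f"not a Wikidata id: {qid!r}")
    items = " ".join(f"wd:{qid}" for qid in qids)
    langs = ", ".join(f'"{lang}"' for lang in languages)
    return (
        "SELECT ?item ?lang ?label WHERE {\n"
        f"  VALUES ?item {{ {items} }}\n"
        "  ?item rdfs:label ?label .\n"
        "  BIND(LANG(?label) AS ?lang)\n"
        f"  FILTER(?lang IN ({langs}))\n"
        "}"
    )


def fetch_labels(qids, languages):
    """Run the query; needs network access and the SPARQLWrapper package."""
    from SPARQLWrapper import SPARQLWrapper, JSON

    client = SPARQLWrapper(WDQS_ENDPOINT)
    client.setQuery(build_label_query(qids, languages))
    client.setReturnFormat(JSON)
    rows = client.query().convert()["results"]["bindings"]
    # "item" comes back as a full entity URI, not as a bare Q-id.
    return [(r["item"]["value"], r["lang"]["value"], r["label"]["value"])
            for r in rows]


# Only builds and prints the query text; no request is sent here.
print(build_label_query(["Q175554", "Q1110527"], ["en", "de"]))
```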
### /temp_shape/10m_physical/ne_10m_lakes_north_america.changes_log.csv # Column changes - csv format
"wd_id","status","variable","value_old","value_new"
"Q1323525","NEWvalue","name_ko","","워싱턴 호"
"Q7356585","MODvalue","name_fr","William","William 'Bill' Dannelly Reservoir"
"Q15118728","NEWvalue","name_en","","Little Salmon Lake"
"Q7236081","NEWvalue","name_de","","Powell Lake"
"Q7236081","NEWvalue","name_es","","Powell Lake"
"Q7236081","NEWvalue","name_it","","Powell Lake"
"Q7236081","NEWvalue","name_nl","","Powell Lake"
"Q22702352","REDIRECT","wikidataid","Q22702352","Q1799606"
"Q22702352","MODvalue","name_de","lac Pusticamica","Lac Pusticamica"
"Q1800890","MODvalue","name_en","Lake Chemong","Chemong Lake"
"Q1800890","NEWvalue","name_sv","","Chemong Lake"
### ./temp_shape/10m_physical/ne_10m_lakes_north_america.changes_log.csv.md # Column changes - markdown
wd_id | status | variable | value_old | value_new
-----------|------------|--------------|-------------------|-----------------------------------
Q1323525 | NEWvalue | name_ko | | 워싱턴 호
Q7356585 | MODvalue | name_fr | William | William 'Bill' Dannelly Reservoir
Q15118728 | NEWvalue | name_en | | Little Salmon Lake
Q7236081 | NEWvalue | name_de | | Powell Lake
Q7236081 | NEWvalue | name_es | | Powell Lake
Q7236081 | NEWvalue | name_it | | Powell Lake
Q7236081 | NEWvalue | name_nl | | Powell Lake
Q22702352 | REDIRECT | wikidataid | Q22702352 | Q1799606
Q22702352 | MODvalue | name_de | lac Pusticamica | Lac Pusticamica
Q1800890 | MODvalue | name_en | Lake Chemong | Chemong Lake
Q1800890 | NEWvalue | name_sv | | Chemong Lake
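For a quick overview of a changes log, the rows can be counted per status. This is a sketch using only the standard library; `status_counts` is a hypothetical helper and `SAMPLE` repeats the rows shown above so that the example is self-contained.

```python
import csv
import io
from collections import Counter

# The sample rows from the changes log above.
SAMPLE = '''\
"wd_id","status","variable","value_old","value_new"
"Q1323525","NEWvalue","name_ko","","워싱턴 호"
"Q7356585","MODvalue","name_fr","William","William 'Bill' Dannelly Reservoir"
"Q15118728","NEWvalue","name_en","","Little Salmon Lake"
"Q7236081","NEWvalue","name_de","","Powell Lake"
"Q7236081","NEWvalue","name_es","","Powell Lake"
"Q7236081","NEWvalue","name_it","","Powell Lake"
"Q7236081","NEWvalue","name_nl","","Powell Lake"
"Q22702352","REDIRECT","wikidataid","Q22702352","Q1799606"
"Q22702352","MODvalue","name_de","lac Pusticamica","Lac Pusticamica"
"Q1800890","MODvalue","name_en","Lake Chemong","Chemong Lake"
"Q1800890","NEWvalue","name_sv","","Chemong Lake"
'''


def status_counts(fileobj):
    """Count the rows of a changes log per status value."""
    return Counter(row["status"] for row in csv.DictReader(fileobj))


counts = status_counts(io.StringIO(SAMPLE))
for status, n in sorted(counts.items()):
    print(status, n)
```

For a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `StringIO` object. For this sample the counts are 7 NEWvalue, 3 MODvalue and 1 REDIRECT.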
### ./temp_shape/10m_physical/ne_10m_lakes_north_america.new_names.csv # input csv
```bash
$ cat ./temp_shape/10m_physical/ne_10m_lakes_north_america.new_names.csv | head
"wd_id","wd_id_new","population","name_ar","name_bn","name_de","name_en","name_es","name_fr","name_el","name_hi","name_hu","name_id","name_it","name_ja","name_ko","name_nl","name_pl","name_pt","name_ru","name_sv","name_tr","name_vi","name_zh"
"Q1426999","","","","","Theodore Roosevelt Lake","Theodore Roosevelt Lake","","","","","","","","","","","","","Рузвельт","","","",""
"Q4397897","","","","","","Ross Barnett Reservoir","","","","","","","","","","","","","Росс Барнетт","","","",""
"Q175554","","","","","Walker Lake","Walker Lake","","Walker Lake","","","Walker-tó","","","ウォーカー湖","","Walker Lake","","","Уокер","","","",""
"Q6908686","","","","","","Mooselookmeguntic Lake","","Mooselookmeguntic Lake","","","","","","","","","","","Муслукмегантик","","","",""
"Q1110527","","","","","Priest Lake","Priest Lake","","Priest Lake","","","","","","","","","","","Прист","","","",""
"Q1627906","","","","","","Caddo Lake","","lac Caddo","","","","","lago Caddo","","","Caddo Lake","","","Каддо","","","",""
"Q4261031","","","","","","Lake Livingston","","lac Livingston","","","","","","","","","","","Ливингстон","","","",""
"Q4231229","","","","","","Lake Conroe","","Lake Conroe","","","","","","","","","","","Конро","","","",""
"Q2365354","","","","","Summer Lake","Summer Lake","","Summer Lake","","","","","","","","","","","Саммер","","","",""
...
```
### ./temp_shape/10m_physical/ne_10m_lakes_north_america.summary_log.csv # Summary of the changes - csv
"shapefilename","var","value"
"10m_physical/ne_10m_lakes_north_america.shp","New_name","7"
"10m_physical/ne_10m_lakes_north_america.shp","Deleted_name","0"
"10m_physical/ne_10m_lakes_north_america.shp","Modified_name","3"
"10m_physical/ne_10m_lakes_north_america.shp","Empty_name ","7899"
"10m_physical/ne_10m_lakes_north_america.shp","Same_name","1604"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_redirected","1"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_notfound","0"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_null","747"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_notnull","453"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_badformated","0"
### ./temp_shape/10m_physical/ne_10m_lakes_north_america.shp.summary_log.csv.md # Summary of the changes - markdown
shapefilename | var | value |
---|---|---|
10m_physical/ne_10m_lakes_north_america.shp | New_name | 7 |
10m_physical/ne_10m_lakes_north_america.shp | Deleted_name | 0 |
10m_physical/ne_10m_lakes_north_america.shp | Modified_name | 3 |
10m_physical/ne_10m_lakes_north_america.shp | Empty_name | 7899 |
10m_physical/ne_10m_lakes_north_america.shp | Same_name | 1604 |
10m_physical/ne_10m_lakes_north_america.shp | Wikidataid_redirected | 1 |
10m_physical/ne_10m_lakes_north_america.shp | Wikidataid_notfound | 0 |
10m_physical/ne_10m_lakes_north_america.shp | Wikidataid_null | 747 |
10m_physical/ne_10m_lakes_north_america.shp | Wikidataid_notnull | 453 |
10m_physical/ne_10m_lakes_north_america.shp | Wikidataid_badformated | 0 |
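The summary numbers can be cross-checked against each other. In this sample the five `*_name` counters add up to 9513, which equals 453 (features with a Wikidata id) times 21 (the `name_*` columns in the new_names.csv header). That relationship is inferred from the sample numbers, not documented, so treat the sketch below as a sanity check under that assumption. `load_summary` and `name_cell_total` are hypothetical helpers.

```python
import csv
import io

# The summary csv shown above (note the trailing space in "Empty_name ").
SUMMARY = '''\
"shapefilename","var","value"
"10m_physical/ne_10m_lakes_north_america.shp","New_name","7"
"10m_physical/ne_10m_lakes_north_america.shp","Deleted_name","0"
"10m_physical/ne_10m_lakes_north_america.shp","Modified_name","3"
"10m_physical/ne_10m_lakes_north_america.shp","Empty_name ","7899"
"10m_physical/ne_10m_lakes_north_america.shp","Same_name","1604"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_redirected","1"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_notfound","0"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_null","747"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_notnull","453"
"10m_physical/ne_10m_lakes_north_america.shp","Wikidataid_badformated","0"
'''

N_LANGUAGES = 21  # number of name_* columns in the new_names.csv header
NAME_COUNTERS = ("New_name", "Deleted_name", "Modified_name",
                 "Empty_name", "Same_name")


def load_summary(fileobj):
    """Read a summary csv into a {var: value} dict, stripping stray spaces."""
    return {row["var"].strip(): int(row["value"])
            for row in csv.DictReader(fileobj)}


def name_cell_total(summary):
    """Total number of name cells that were classified."""
    return sum(summary[key] for key in NAME_COUNTERS)


summary = load_summary(io.StringIO(SUMMARY))
print(name_cell_total(summary), summary["Wikidataid_notnull"] * N_LANGUAGES)
print(summary["Wikidataid_null"] + summary["Wikidataid_notnull"])
```

If the two numbers on the first printed line differ for another shape file, that is a hint to look at the log before copying anything.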
## My best practice
- Run ./run_all.sh step by step (line by line) in fetch_write mode
- check the audit csv files (open them in LibreOffice and filter)
- find & fix the 'fake' wikidata changes :(
- iterate, or modify the input csv, and write the shape files
- check the shape files and move them to the correct folders
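What counts as a 'fake' change is not defined here. One plausible reading is a modification that differs only in letter case, such as "lac Pusticamica" versus "Lac Pusticamica" in the changes log. Under that assumption, a small filter can pre-select rows for manual review. `is_case_only_change` and `case_only_rows` are hypothetical helpers, and the sample rows are the MODvalue rows from the changes log shown earlier.

```python
import csv
import io

# The three MODvalue rows from the sample changes log.
SAMPLE = '''\
"wd_id","status","variable","value_old","value_new"
"Q7356585","MODvalue","name_fr","William","William 'Bill' Dannelly Reservoir"
"Q22702352","MODvalue","name_de","lac Pusticamica","Lac Pusticamica"
"Q1800890","MODvalue","name_en","Lake Chemong","Chemong Lake"
'''


def is_case_only_change(old, new):
    """True if the two values differ, but only in upper/lower case."""
    return old != new and old.casefold() == new.casefold()


def case_only_rows(fileobj):
    """Return the MODvalue rows whose old and new value differ only in case."""
    return [
        row for row in csv.DictReader(fileobj)
        if row["status"] == "MODvalue"
        and is_case_only_change(row["value_old"], row["value_new"])
    ]


for row in case_only_rows(io.StringIO(SAMPLE)):
    print(row["wd_id"], row["variable"], row["value_old"], "->", row["value_new"])
```

This only narrows down what to look at. Whether a flagged row should be kept or reverted is still a manual decision.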
## Known problems
When updating 10m_cultural/ne_10m_admin_1_label_points_details.shp, I got a lot of warnings:
WARNING:Fiona:CPLE_AppDefined in b'Value -4.75267000000000017 of field longitude of feature 4645 not successfully written. Possibly due to too larger number with respect to field width'
WARNING:Fiona:CPLE_AppDefined in b'Value 10.2509999999999994 of field latitude of feature 4646 not successfully written. Possibly due to too larger number with respect to field width'
WARNING:Fiona:CPLE_AppDefined in b'Value -3.34011000000000013 of field longitude of feature 4646 not successfully written. Possibly due to too larger number with respect to field width'
...
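The warnings point at numeric values whose text form is longer than the field width. The real field definitions of this shape file are not listed here, so the width and decimals below are placeholders. The sketch only shows how the written length of a value depends on the number of decimals; `formatted_length` and `fits_field` are hypothetical helpers.

```python
def formatted_length(value, decimals):
    """Length of the value written as fixed-point text with `decimals` digits."""
    return len(f"{value:.{decimals}f}")


def fits_field(value, width, decimals):
    """True if the fixed-point text of `value` fits into `width` characters."""
    return formatted_length(value, decimals) <= width


# Hypothetical field definition (width 10), for illustration only.
print(fits_field(-4.75267, width=10, decimals=6))   # "-4.752670" has 9 characters
print(fits_field(-4.75267, width=10, decimals=17))  # 20 characters, too long
```

If this is indeed the cause, rounding the coordinates to the field's number of decimals before writing, or widening the field, would be the two obvious things to try.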
## Uppercase / lowercase variable names
lettercase = uppercase variable names [WIKIDATAID, NAME_AR, NAME_BN, NAME_DE, NAME_EN, NAME_ES, ... ]
- 10m_cultural/ne_10m_admin_0_countries_lakes.shp
- 10m_cultural/ne_10m_admin_0_countries.shp
- 10m_cultural/ne_10m_admin_0_disputed_areas.shp
- 10m_cultural/ne_10m_admin_0_map_subunits.shp
- 10m_cultural/ne_10m_admin_0_map_units.shp
- 10m_cultural/ne_10m_admin_0_sovereignty.shp
- 50m_cultural/....
- 110m_cultural/....
lettercase = lowercase variable names [wikidataid, name_ar, name_bn, name_de, name_en, name_es, ... ]
- 10m_cultural/ne_10m_admin_1_states_provinces_lakes.shp
- 10m_cultural/ne_10m_admin_1_states_provinces.shp
- 10m_cultural/ne_10m_airports.shp
- 10m_cultural/ne_10m_populated_places.shp
- 10m_physical/ne_10m_geographic_lines.shp
- 10m_physical/ne_10m_geography_marine_polys.shp
- 10m_physical/ne_10m_geography_regions_elevation_points.shp
- 10m_physical/ne_10m_geography_regions_points.shp
- 10m_physical/ne_10m_geography_regions_polys.shp
- 10m_physical/ne_10m_lakes_europe.shp
- 10m_physical/ne_10m_lakes_historic.shp
- 10m_physical/ne_10m_lakes_north_america.shp
- 10m_physical/ne_10m_lakes.shp
- 10m_physical/ne_10m_playas.shp
- 10m_physical/ne_10m_rivers_europe.shp
- 10m_physical/ne_10m_rivers_lake_centerlines_scale_rank.shp
- 10m_physical/ne_10m_rivers_lake_centerlines.shp
- 10m_physical/ne_10m_rivers_north_america.shp
- 10m_cultural/ne_10m_admin_1_label_points_details.shp
- 50m_cultural/...
- 50m_physical/...
- 110m_cultural/...
- 110m_physical/...
See ./run_all.sh for the latest information.
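The LetterCase argument has to match how the columns are spelled in the shape file. A minimal sketch of that mapping follows, with a guess at how the right value could be detected from a list of column names. `apply_lettercase` and `detect_lettercase` are hypothetical helpers, not functions of the tool.

```python
def apply_lettercase(field, lettercase):
    """Spell a column name the way the given lettercase expects it."""
    if lettercase == "uppercase":
        return field.upper()
    if lettercase == "lowercase":
        return field.lower()
    raise ValueError(f"unknown lettercase: {lettercase!r}")


def detect_lettercase(field_names):
    """Guess the lettercase from the spelling of the wikidataid column."""
    if "WIKIDATAID" in field_names:
        return "uppercase"
    if "wikidataid" in field_names:
        return "lowercase"
    return None  # no wikidataid column found


print(apply_lettercase("name_en", "uppercase"))
print(detect_lettercase(["wikidataid", "name_ar", "name_bn"]))
```

With Fiona, the column names of a shape file could be taken from `schema["properties"]` of the opened collection and passed to `detect_lettercase`.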