src/hdalab/README.txt
author rougeronj
Thu, 20 Nov 2014 15:10:35 +0100
changeset 366 cd359ba0137b
parent 363 627596669480
child 373 1e2c3abcc888
permissions -rw-r--r--
Update block trans for Django translation. Minor corrections in the text
== Geographic inclusion ==

The migration 0007_geographic_inclusion creates the required tables.

To retrieve the geographic inclusion data from DBpedia (requires the SPARQLWrapper Python package):
    python manage.py query_geo_inclusion
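
As a rough illustration of what such a lookup might involve, the sketch below builds a SPARQL query asking DBpedia which places a resource is part of. The dbpedia-owl:isPartOf predicate, the function name and the endpoint URL in the trailing comment are assumptions made for illustration; the actual query_geo_inclusion command may use a different query.

```python
# Hypothetical sketch: not the actual query_geo_inclusion implementation.

def build_inclusion_query(resource_uri, limit=50):
    """Return a SPARQL query listing the places that contain resource_uri.

    The dbpedia-owl:isPartOf predicate is an assumption; the real command
    may rely on other DBpedia properties.
    """
    if "<" in resource_uri or ">" in resource_uri:
        raise ValueError("invalid URI: %r" % resource_uri)
    return (
        "PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?parent WHERE {\n"
        "  <%s> dbpedia-owl:isPartOf ?parent .\n"
        "}\n"
        "LIMIT %d" % (resource_uri, int(limit))
    )


query = build_inclusion_query("http://fr.dbpedia.org/resource/Lyon")

# With the SPARQLWrapper package installed, the query could be sent like this
# (endpoint URL assumed):
#   from SPARQLWrapper import SPARQLWrapper, JSON
#   sparql = SPARQLWrapper("http://fr.dbpedia.org/sparql")
#   sparql.setQuery(query)
#   sparql.setReturnFormat(JSON)
#   results = sparql.query().convert()
```

Building the query as a plain string keeps the sketch runnable without network access; only the commented part needs SPARQLWrapper.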

== Processing the countries.geo.json file ==

The file was downloaded from https://github.com/johan/world.geo.json/
To match the country labels to the semantized tags, run the script
    python manage.py geojson_transform <path/to/file.geo.json>
which adds this information to the file.
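
The kind of transformation described above can be sketched as follows. This is a minimal example, assuming that each GeoJSON feature carries its country label in properties["name"] and that the tag reference is stored under a "dbpedia_uri" key; both names, and the label-to-tag dictionary, are hypothetical and not taken from geojson_transform.

```python
import json


def annotate_countries(geojson, label_to_tag, key="dbpedia_uri"):
    """Add a tag reference to each feature whose name is in label_to_tag.

    Returns the number of features that were annotated. The property
    names ("name", "dbpedia_uri") are assumptions for this sketch.
    """
    matched = 0
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}
        tag = label_to_tag.get(props.get("name"))
        if tag is not None:
            props[key] = tag
            feature["properties"] = props
            matched += 1
    return matched


# Tiny in-memory example instead of the real countries.geo.json file.
countries = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "id": "FRA", "properties": {"name": "France"}, "geometry": null},
  {"type": "Feature", "id": "ATA", "properties": {"name": "Antarctica"}, "geometry": null}
]}
""")
tags = {"France": "http://fr.dbpedia.org/resource/France"}
count = annotate_countries(countries, tags)
```

Returning the number of matches makes it easy to spot countries whose label has no corresponding tag.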

== Importing the INSEE data ==

The migration 0008_datasheet_insee creates the required tables.

First, import into the database the files that give the geographic coordinates for each INSEE code.
The file data/villes.csv was downloaded from http://www.pillot.fr/cartographe/fic_villes.php
A few towns are missing from this file, and its INSEE codes for Paris, Marseille and Lyon do not include the arrondissements.
The missing communes are listed in additional_cities.csv
To import each of these files:
    python manage.py import_insee_csv <path/to/file.csv>
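
For readers who want to inspect such a file before importing it, here is a minimal sketch of reading coordinates keyed by INSEE code. The column names, the ";" delimiter and the sample rows are assumptions; the real layout of data/villes.csv may differ, and this is not the import_insee_csv implementation.

```python
import csv
import io


def read_insee_coordinates(lines, delimiter=";"):
    """Map each INSEE code to a (latitude, longitude) pair.

    Assumed columns: insee_code, name, latitude, longitude.
    """
    coords = {}
    for row in csv.DictReader(lines, delimiter=delimiter):
        # Keep the code as a string: codes can start with 0 or contain
        # letters (Corsican codes start with 2A or 2B).
        code = row["insee_code"].strip()
        coords[code] = (float(row["latitude"]), float(row["longitude"]))
    return coords


# Illustrative sample; coordinates are rounded.
sample = io.StringIO(
    "insee_code;name;latitude;longitude\n"
    "01053;Bourg-en-Bresse;46.21;5.23\n"
    "2A004;Ajaccio;41.92;8.74\n"
)
coords = read_insee_coordinates(sample)
```

Treating the code as text rather than as a number avoids losing leading zeros.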

Next, import the files that map HDA records (referenced by their hda_id identifier) to INSEE codes.
This mapping comes from an Excel file provided by Bertrand, which I converted to CSV and corrected: data/hda_insee.csv
To import this file:
    python manage.py import_insee_hda_csv <path/to/file.csv>
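
Since the coordinates file is known to miss some communes, it can be useful to check which HDA records end up without a location once both mappings are available. The sketch below assumes two plain dictionaries (hda_id to INSEE code, INSEE code to coordinates); the function name and the sample values are hypothetical.

```python
def locate_records(hda_to_insee, insee_coords):
    """Attach coordinates to HDA records through their INSEE code.

    Returns (located, missing): located maps hda_id to (lat, lon), and
    missing lists the hda_ids whose INSEE code has no known coordinates.
    """
    located, missing = {}, []
    for hda_id, insee_code in sorted(hda_to_insee.items()):
        if insee_code in insee_coords:
            located[hda_id] = insee_coords[insee_code]
        else:
            missing.append(hda_id)
    return located, missing


# Illustrative values only.
insee_coords = {"69123": (45.76, 4.84)}
hda_to_insee = {"1001": "69123", "1002": "75056"}
located, missing = locate_records(hda_to_insee, insee_coords)
```

Any identifier reported in missing points to an INSEE code that would need to be added to the coordinates data.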

== Setting up the virtual environment ==

0) prerequisite: Python 2.6 (64-bit).
1) install PostgreSQL 9.X, because it runs in 64-bit mode.
3) $ export PYTHONPATH=/path/to/workspace/hdabo/web   (e.g. /Users/tc/dev/eclipse_workspace/hdabo/web)
4) add the path to pg_config to the environment variables by adding the following line to ~/.bashrc:
  export PATH=$PATH:/Library/PostgreSQL/9.X/bin
- make sure that ~/.bashrc is read when a terminal is opened by adding the following line to /etc/bashrc:
  [ -r ~/.bashrc ] && . ~/.bashrc
- change to the right directory and create the virtualenv:
  $ cd /path/to/workspace/hdabo/web   (e.g. /Users/tc/dev/eclipse_workspace/hdabo/web)
  $ python create_python_env.py
  $ python project-boot.py --no-site-packages --type-install=local env/myhdaboenv
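
Before running these steps, the two prerequisites that most often cause trouble (a 64-bit interpreter and pg_config being reachable through PATH) can be checked with a small script. This is an optional helper sketched for illustration, not part of the project; the function names are hypothetical.

```python
import os
import struct


def is_64bit_python():
    """True when the running interpreter uses 64-bit pointers."""
    return struct.calcsize("P") * 8 == 64


def find_on_path(program, path=None):
    """Return the first executable named program found on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print("64-bit Python: %s" % is_64bit_python())
    print("pg_config: %s" % (find_on_path("pg_config") or "not found on PATH"))
```

If pg_config is reported as not found, the PATH line from step 4 has probably not been picked up by the current shell.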


=== Migration hdabo -> hdalab ===

0) apply the South migrations
1) run the query_wikipedia_category command
2) run the fill_tag_years command
3) run the query_wikipedia command
4) run the query_geo_inclusion command
5) run the geo_json_transform command
6) run the import_insee_csv command
7) run the import_hda_insee_csv command
8) run the query_category_inclusion command

All of these steps are chained together in the import_hdabo_db command.
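
The idea of chaining management commands can be sketched as below. This is only an illustration of the sequencing, not the import_hdabo_db implementation: the run_steps helper, its parameters and the sample step list are assumptions. Inside a Django project, django.core.management.call_command would be the more usual way to invoke each step.

```python
import subprocess
import sys


def run_steps(steps, runner=subprocess.check_call, manage_py="manage.py"):
    """Run management commands in order, stopping at the first failure.

    steps is a list of (command_name, [arguments]) pairs. runner is
    injectable so the sequencing can be exercised without Django.
    """
    done = []
    for name, args in steps:
        # check_call raises CalledProcessError on a non-zero exit status,
        # which aborts the remaining steps.
        runner([sys.executable, manage_py, name] + list(args))
        done.append(name)
    return done


# Dry run: record the command lines instead of executing them.
recorded = []
steps = [
    ("migrate", []),
    ("query_geo_inclusion", []),
    ("import_insee_csv", ["data/villes.csv"]),
]
run_steps(steps, runner=recorded.append)
```

Stopping at the first failure matters here because later steps depend on the tables and data produced by earlier ones.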


=== Migration hdalab 1 -> hdalab 2 ===

1) syncdb
2) migrate (migrate switches the hdabo_tags over to fr.dbpedia)
3) query_dbpedia
4) fill_tag_years
5) query_geo_inclusion