src/hdalab/management/commands/query_dbpedia.py
author ymh <ymh.work@gmail.com>
Wed, 11 Apr 2018 12:19:47 +0200
branch documentation
changeset 693 09e00f38d177
parent 571 d9642be7c937
permissions -rw-r--r--
Add hdabo/hdalab documentations
# -*- coding: utf-8 -*-
'''
Query DBpedia to populate :class:`hdabo.models.Tag` objects.
Only semantized tags are processed.

The following data is harvested:

  - label in all available languages
  - abstract in all available languages
  - thumbnail
  - links between tags

**Usage**: ``django-admin query_dbpedia [options]``

**Specific options:**

    - *\-\-all* :               force processing of all tags
    - *\-\-random* :            process the tags in random order
    - *\-\-force* :             ask no questions
    - *\-\-limit=LIMIT* :       number of tags to process
    - *\-\-start=START* :       number of tags to skip
    - *\-\-tag=TAG* :           restrict processing to this tag

'''
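
The way ``--start`` and ``--limit`` combine can be sketched on a plain list. This is only an illustration of the documented meaning (skip START tags, then process at most LIMIT tags, with a negative limit meaning no cap); ``select_window`` is a hypothetical helper and is not part of the command.

```python
def select_window(items, start=0, limit=-1):
    """Skip `start` items, then keep at most `limit` items (negative limit: no cap)."""
    if limit >= 0:
        return items[start:start + limit]
    return items[start:]


tags = ["a", "b", "c", "d", "e"]
print(select_window(tags, start=1, limit=2))  # two tags, after skipping the first
print(select_window(tags, start=3))           # everything after the first three
```

Note that the upper bound of the slice is ``start + limit``, not ``limit``: otherwise a non-zero START would shrink the number of tags processed.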

from hdabo.models import Tag
from hdabo.utils import show_progress
from hdalab.models import DbpediaFields, TagLinks
from hdalab.models.dataviz import DbpediaFieldsTranslation
import logging
from optparse import make_option
import sys
import traceback

from django import db
from django.conf import settings
from django.core.management.base import NoArgsCommand
from django.core.management.color import no_style
from django.db import transaction
from django.db.models import Count
from rdflib import URIRef, Graph
import requests


logger = logging.getLogger(__name__)

class Command(NoArgsCommand):
    '''
    Query DBpedia and update the stored DBpedia fields of semantized tags.
    '''
    options = ''
    help = """Query DBpedia and update the stored DBpedia fields of semantized tags."""

    option_list = NoArgsCommand.option_list + (
        make_option('--all',
            action='store_true',
            dest='all',
            default=False,
            help='force all tags to be updated, not only those not yet processed'),
        make_option('--force',
            action='store_true',
            dest='force',
            default=False,
            help='ask no questions'),
        make_option('--random',
            action='store_true',
            dest='random',
            default=False,
            help='process tags in random order'),
        make_option('--limit',
            action='store',
            type='int',
            dest='limit',
            default=-1,
            help='number of tags to process'),
        make_option('--start',
            action='store',
            type='int',
            dest='start',
            default=0,
            help='number of tags to skip'),
        make_option('--tag',
            action='append',
            dest='tags',
            type='string',
            default=[],
            help='the tag to query (may be repeated)'),
    )
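
How these option declarations behave can be checked outside Django with the standard-library ``optparse`` module. The sketch below is an assumption-laden illustration: it uses a bare ``OptionParser`` rather than Django's command machinery, declares only a subset of the options, and the tag names are made up.

```python
from optparse import OptionParser, make_option

# Same declarations as in the command, reduced to the options exercised here.
option_list = [
    make_option('--all', action='store_true', dest='all', default=False),
    make_option('--limit', action='store', type='int', dest='limit', default=-1),
    make_option('--start', action='store', type='int', dest='start', default=0),
    make_option('--tag', action='append', dest='tags', type='string', default=[]),
]

parser = OptionParser(option_list=option_list)
options, _ = parser.parse_args(['--tag', 'Paris', '--tag', 'Louvre', '--limit', '10'])

print(options.tags)   # every --tag occurrence is collected into one list
print(options.limit)  # parsed as an int
print(options.start)  # default value, since --start was not given
```

Because ``--tag`` uses ``action='append'``, repeating it on the command line accumulates labels in ``options.tags`` instead of overwriting the previous value.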

    def query_dbpedia(self, query, fmt='n3'):
        '''Run a SPARQL query against the configured DBpedia endpoint and parse the response into an rdflib Graph.'''
        url = settings.DBPEDIA_URI_TEMPLATE % ('sparql', '')
        params = {
            'query': query,
            # MIME type requested from the endpoint; unknown formats fall back to Turtle.
            'format': {'n3': 'text/turtle', 'rdf/xml': 'application/rdf+xml', 'nt': 'text/plain'}.get(fmt, 'text/turtle')
        }
        resp = requests.get(url, params=params)
        logger.debug("Query dbpedia : %s", resp.text)
        return Graph().parse(data=resp.text, format=fmt)


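The request that ``query_dbpedia`` sends can be sketched with the standard library alone. The value of ``DBPEDIA_URI_TEMPLATE`` below is an assumption (the real one lives in the project's Django settings and is not shown here), and ``build_sparql_url`` is a hypothetical helper that only builds the URL without performing any network call.

```python
try:
    from urllib.parse import urlencode, parse_qs, urlsplit  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2
    from urlparse import parse_qs, urlsplit

# Assumed value, for illustration only.
DBPEDIA_URI_TEMPLATE = "http://dbpedia.org/%s/%s"

# Short format name -> MIME type requested from the endpoint.
MIME_TYPES = {'n3': 'text/turtle', 'rdf/xml': 'application/rdf+xml', 'nt': 'text/plain'}


def build_sparql_url(query, fmt='n3'):
    """Return the GET URL for `query`; unknown formats fall back to Turtle."""
    endpoint = DBPEDIA_URI_TEMPLATE % ('sparql', '')
    params = {'query': query, 'format': MIME_TYPES.get(fmt, 'text/turtle')}
    return endpoint + '?' + urlencode(params)


url = build_sparql_url(
    "select distinct ?y where {<http://dbpedia.org/resource/Paris> "
    "<http://www.w3.org/2000/01/rdf-schema#label> ?y}")
print(url)
print(parse_qs(urlsplit(url).query)['format'])
```

``requests.get(url, params=params)`` performs the same query-string encoding itself; the helper only makes the resulting URL visible.
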
    def handle_noargs(self, **options):

        self.style = no_style()

        self.interactive = options.get('interactive', True)
        self.verbosity = int(options.get('verbosity', '1'))
        self.force = options.get('force', False)
        self.limit = options.get("limit", -1)
        self.start = options.get("start", 0)
        self.random = options.get('random', False)

        if self.verbosity > 2:
            print "option passed : " + repr(options)

        self.tag_list = options.get("tags", [])

        # Only semantized tags (those with a DBpedia URI) are candidates.
        queryset = Tag.objects.exclude(dbpedia_uri=None)

        if self.tag_list:
            queryset = queryset.filter(label__in=self.tag_list)
        elif not options.get('all', False):
            # By default, keep only the tags that have no DBpedia fields yet.
            queryset = queryset.annotate(dbfc=Count('dbpedia_fields')).filter(dbfc=0)

        if self.random:
            queryset = queryset.order_by("?")
        else:
            queryset = queryset.order_by("label")

        if self.limit >= 0:
            # Skip `start` tags, then process at most `limit` tags.
            queryset = queryset[self.start:self.start + self.limit]
        elif self.start > 0:
            queryset = queryset[self.start:]

        if self.verbosity > 2:
            print "Tag Query is %s" % (queryset.query)

        count = queryset.count()

        if count == 0:
            print "No tag to query : exit."
            return

        if not self.force and self.interactive:
            confirm = raw_input("You have requested to query and replace the dbpedia information for %d tags.\n Are you sure you want to do this? \nType 'yes' to continue, or 'no' to cancel: " % (count))
        else:
            confirm = 'yes'

        if confirm != "yes":
            print "dbpedia query cancelled"
            return

        writer = None
        for i, tag in enumerate(queryset):
            writer = show_progress(i+1, count, tag.label, 50, writer)
            db.reset_queries()

            #abstract query
            #"select ?y
            # where {<%s>  <http://dbpedia.org/ontology/abstract> ?y}" % (tag.dbpedia_uri)

            #rdf_uri = re.sub('\/resource\/', "/data/", tag.dbpedia_uri) + ".n3"
            #g = Graph()
            try:
                abstracts = {}
                labels = {}
                thumbnail = None
                with transaction.atomic():
                    res_abstracts = self.query_dbpedia("select distinct ?y where {<%s>  <http://dbpedia.org/ontology/abstract> ?y}" % (tag.dbpedia_uri), 'n3')
                    for _,_,o in res_abstracts.triples((None, URIRef('http://www.w3.org/2005/sparql-results#value'), None)):
                        abstracts[o.language] = (unicode(o), True)
                    logger.debug("Abstracts: %r" % abstracts)

                    res_labels = self.query_dbpedia("select distinct ?y where {<%s>  <http://www.w3.org/2000/01/rdf-schema#label> ?y}" % (tag.dbpedia_uri), 'n3')
                    for _,_,o in res_labels.triples((None, URIRef('http://www.w3.org/2005/sparql-results#value'), None)):
                        labels[o.language] = (unicode(o), True)
                    logger.debug("Labels: %r" % labels)

                    res_thumbnails = self.query_dbpedia("select distinct ?y where {<%s>  <http://dbpedia.org/ontology/thumbnail> ?y} limit 1" % (tag.dbpedia_uri), 'n3')
                    for _,_,o in res_thumbnails.triples((None, URIRef('http://www.w3.org/2005/sparql-results#value'), None)):
                        thumbnail = unicode(o)

                    res_links = self.query_dbpedia('select distinct ?y where { <%s> ?p ?y . FILTER regex(?y, "^%s")}' % (tag.dbpedia_uri, settings.DBPEDIA_URI_TEMPLATE % ( 'resource', '' )), 'n3')
                    for _,_,o in res_links.triples((None, URIRef('http://www.w3.org/2005/sparql-results#value'), None)):
                        tagqs = Tag.objects.filter(dbpedia_uri=unicode(o))
                        if tagqs:
                            TagLinks.objects.get_or_create(subject=tag, object=tagqs[0])

                    # Reference values: prefer French, then English, then any available language.
                    ref_label_lang, (ref_label, _) = ('fr',labels['fr']) if 'fr' in labels else ('en',labels['en']) if 'en' in labels else labels.items()[0] if len(labels) > 0 else ('fr',(tag.label, True))
                    ref_abstract_lang, (ref_abstract, _) = ('fr',abstracts['fr']) if 'fr' in abstracts else ('en',abstracts['en']) if 'en' in abstracts else abstracts.items()[0] if len(abstracts) > 0 else ('fr',(None, True))

                    # Fill in missing languages with the reference value, flagged False (not a native translation).
                    for lang in settings.LANGUAGES:
                        if lang[0] not in labels:
                            labels[lang[0]] = (ref_label, False)
                        if lang[0] not in abstracts:
                            abstracts[lang[0]] = (ref_abstract, False)

693
09e00f38d177 Add hdabo/hdalab documentations
ymh <ymh.work@gmail.com>
parents: 571
diff changeset
   203
                    dbfield , created = DbpediaFields.objects.get_or_create(tag=tag, defaults={'dbpedia_uri':tag.dbpedia_uri, 'abstract':ref_abstract, 'thumbnail':thumbnail, 'label':ref_label}) #@UndefinedVariable
                    if not created:
                        # Existing row: refresh its fields and drop stale translations before recreating them.
                        dbfield.dbpedia_uri = tag.dbpedia_uri
                        dbfield.abstract = ref_abstract
                        dbfield.thumbnail = thumbnail
                        dbfield.label = ref_label
                        dbfield.save()
                        DbpediaFieldsTranslation.objects.filter(master=dbfield).delete()

                    # Merge labels and abstracts into one [label, abstract] entry per language.
                    consolidated_trans = {}
                    for lang, label in labels.iteritems():
                        consolidated_trans[lang] = [label, (ref_abstract, lang == ref_abstract_lang)]
                    for lang, abstract in abstracts.iteritems():
                        if lang in consolidated_trans:
                            consolidated_trans[lang][1] = abstract
                        else:
                            consolidated_trans[lang] = [(ref_label, lang == ref_label_lang), abstract]

                    for lang, trans in consolidated_trans.iteritems():
                        label, abstract = trans
                        DbpediaFieldsTranslation.objects.create(master=dbfield, language_code=lang, label=label[0], is_label_translated=label[1], abstract=abstract[0], is_abstract_translated=abstract[1])

            except Exception as e:
                # Report the failure and keep going with the next tag.
                if tag.dbpedia_uri:
                    print "\nError processing resource %s : %s" % (tag.dbpedia_uri, unicode(e))
                else:
                    print "\nError processing resource : %s" % unicode(e)
                traceback.print_exception(type(e), e, sys.exc_info()[2])