Urban&You AI Library

1) Objective

Provide a dataset, built from our editorial content, that AI assistants and modern search engines can read:

  • Clean LocalBusiness records, normalized as schema.org JSON-LD.
  • Articles linked to those records (about/mentions), plus thematic lists.
  • Dedicated sitemaps and a public JSONL feed for machine consumption.
  • Versioning, caching, ETags, and transparency (the “Soutenu par l’adresse” label, i.e. "Supported by the venue").

2) Data model (core)

Main entities:

  • Organization (publisher: Urban&You)
  • LocalBusiness (subtypes: Restaurant, BeautySalon, Store, CafeOrCoffeeShop, etc.)
  • Article (report, guide, review)
  • ItemList (selections / top lists)

Principles

  • Stable identifiers via @id (canonical URL + fragment anchor).
  • Language: inLanguage:"fr".
  • Relations: Article.about → LocalBusiness, ItemList.itemListElement → ListItem with a url, plus isPartOf and mainEntityOfPage.
  • Dates: datePublished, dateModified.
  • Version: schemaVersion (dataset) + version (record).
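The stable-identifier principle can be sketched as a small helper. This is only an illustration: the function name makeId, the kind labels, and the normalization rules (dropping query strings and trailing slashes) are assumptions, not part of the original specification. The anchors themselves (#id, #article, #list, #org) are the ones used in the examples of this document.

```javascript
// Hypothetical helper: builds a stable @id from a canonical URL and an anchor.
// Anchor names follow the examples in this document; the kind labels are assumed.
const ANCHORS = { place: 'id', article: 'article', list: 'list', org: 'org' };

function makeId(canonicalUrl, kind) {
  const anchor = ANCHORS[kind];
  if (!anchor) throw new Error(`Unknown entity kind: ${kind}`);
  const url = new URL(canonicalUrl); // throws on malformed input
  url.hash = '';   // drop any existing fragment
  url.search = ''; // drop query strings (tracking parameters, etc.)
  // Trim a trailing slash on non-root paths so both spellings map to one @id.
  if (url.pathname.length > 1 && url.pathname.endsWith('/')) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return `${url.href}#${anchor}`;
}
```

With these assumptions, a URL carrying tracking parameters and the clean canonical URL resolve to the same @id, which is what keeps cross-references between Article and LocalBusiness nodes from drifting.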

3) JSON-LD: complete examples

3.1. LocalBusiness record + Article + ItemList (single graph)

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://urbanandyou.com/#org",
      "name": "Urban&You",
      "url": "https://urbanandyou.com",
      "inLanguage": "fr"
    },
    {
      "@type": "CafeOrCoffeeShop",
      "@id": "https://urbanandyou.com/lieux/cafe-du-canal#id",
      "isPartOf": {"@id": "https://urbanandyou.com/#org"},
      "name": "Café du Canal",
      "url": "https://urbanandyou.com/lieux/cafe-du-canal",
      "image": [
        "https://urbanandyou.com/images/lieux/cafe-du-canal-1.jpg",
        "https://urbanandyou.com/images/lieux/cafe-du-canal-2.jpg"
      ],
      "telephone": "+33 1 23 45 67 89",
      "address": {
        "@type": "PostalAddress",
        "streetAddress": "12 Quai des Arts",
        "addressLocality": "Paris",
        "postalCode": "75010",
        "addressCountry": "FR"
      },
      "geo": {"@type":"GeoCoordinates","latitude":48.8751,"longitude":2.3657},
      "priceRange": "€€",
      "amenityFeature": [
        {"@type":"LocationFeatureSpecification","name":"WiFi","value":true},
        {"@type":"LocationFeatureSpecification","name":"Terrasse","value":true}
      ],
      "openingHoursSpecification": [
        {"@type":"OpeningHoursSpecification","dayOfWeek":["Monday","Tuesday","Wednesday","Thursday","Friday"],"opens":"08:00","closes":"18:00"},
        {"@type":"OpeningHoursSpecification","dayOfWeek":["Saturday"],"opens":"09:00","closes":"17:00"}
      ],
      "sameAs": ["https://g.page/cafeducanal","https://www.instagram.com/cafeducanal"],
      "inLanguage": "fr",
      "version": "1.0.0",
      "dateModified": "2025-09-17"
    },
    {
      "@type": "Article",
      "@id": "https://urbanandyou.com/articles/cafe-du-canal-reportage#article",
      "headline": "Le Café du Canal : filtre doux et terrasse au soleil",
      "mainEntityOfPage": "https://urbanandyou.com/articles/cafe-du-canal-reportage",
      "about": {"@id":"https://urbanandyou.com/lieux/cafe-du-canal#id"},
      "author": {"@id":"https://urbanandyou.com/#org"},
      "image": ["https://urbanandyou.com/images/lieux/cafe-du-canal-hero.jpg"],
      "datePublished": "2025-09-17",
      "dateModified": "2025-09-17",
      "inLanguage": "fr"
    },
    {
      "@type": "ItemList",
      "@id": "https://urbanandyou.com/listes/cafes-pour-bosser#list",
      "name": "Top cafés pour bosser (10e–11e)",
      "itemListElement": [
        {"@type":"ListItem","position":1,"url":"https://urbanandyou.com/lieux/cafe-du-canal"}
      ],
      "inLanguage": "fr"
    }
  ]
}

3.2. Best practices

  • Stable @id with a #id / #article / #list anchor.
  • Article ↔ LocalBusiness link via about.
  • Minimum data: name, url, image, address, geo, openingHoursSpecification.
  • Accessibility: amenityFeature (reduced-mobility access, terrace, kid-friendly, etc.).
  • Traceability: version, dateModified.
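The minimum-data rule lends itself to an automated check. The sketch below is one possible implementation, not a prescribed one: the function name checkPlace and the wording of the messages are hypothetical, while the list of required keys and the #id anchor convention come from the best practices above.

```javascript
// Required keys, taken from the "minimum data" best practice.
const REQUIRED_KEYS = ['name', 'url', 'image', 'address', 'geo', 'openingHoursSpecification'];

// Hypothetical checker: returns the problems found on one LocalBusiness node.
// An empty array means the node passes.
function checkPlace(node) {
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    const value = node[key];
    const isEmpty =
      value == null || value === '' || (Array.isArray(value) && value.length === 0);
    if (isEmpty) problems.push(`missing ${key}`);
  }
  // The anchor convention for places in this document is "#id".
  if (typeof node['@id'] !== 'string' || !node['@id'].endsWith('#id')) {
    problems.push('@id should end with #id');
  }
  return problems;
}
```

Returning a list of messages rather than throwing on the first problem makes it easier to report everything wrong with a record in a single QA pass.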

4) Dedicated sitemaps

4.1. sitemap-lieux.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://urbanandyou.com/lieux/cafe-du-canal</loc>
    <lastmod>2025-09-17</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>

4.2. sitemap-articles.xml

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://urbanandyou.com/articles/cafe-du-canal-reportage</loc>
    <lastmod>2025-09-17</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.7</priority>
  </url>
</urlset>

4.3. Index

<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://urbanandyou.com/sitemaps/sitemap-lieux.xml</loc></sitemap>
  <sitemap><loc>https://urbanandyou.com/sitemaps/sitemap-articles.xml</loc></sitemap>
</sitemapindex>
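The sitemap files shown in this section could be generated from the same records that feed the JSON-LD. The following is a minimal sketch under assumptions: the function names (escapeXml, tag, buildSitemap) and the shape of the entries array are hypothetical, and only the tags used in the examples above are emitted.

```javascript
// Escapes the five characters that XML reserves.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Emits one child tag, or nothing when the value is not provided.
function tag(name, value) {
  return value === undefined ? [] : [`    <${name}>${escapeXml(value)}</${name}>`];
}

// entries: [{ loc, lastmod, changefreq, priority }] (assumed shape, mirroring the tags).
function buildSitemap(entries) {
  const urls = entries.flatMap((entry) => [
    '  <url>',
    ...tag('loc', entry.loc),
    ...tag('lastmod', entry.lastmod),
    ...tag('changefreq', entry.changefreq),
    ...tag('priority', entry.priority),
    '  </url>',
  ]);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    '</urlset>',
    '',
  ].join('\n');
}
```

Feeding lastmod from each record's dateModified would keep the sitemap and the structured data in agreement, although that wiring is left out here.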


5) Public JSONL feed (read-only)

  • Path: /data/lieux.jsonl (one line = one JSON object or one mini JSON-LD graph).
  • Encoding: UTF-8.
  • Media type: Content-Type: application/x-ndjson.

Example (2 lines):

{"@context":"https://schema.org","@type":"CafeOrCoffeeShop","@id":"https://urbanandyou.com/lieux/cafe-du-canal#id","name":"Café du Canal","url":"https://urbanandyou.com/lieux/cafe-du-canal","inLanguage":"fr","address":{"@type":"PostalAddress","streetAddress":"12 Quai des Arts","addressLocality":"Paris","postalCode":"75010","addressCountry":"FR"},"geo":{"@type":"GeoCoordinates","latitude":48.8751,"longitude":2.3657},"openingHoursSpecification":[{"@type":"OpeningHoursSpecification","dayOfWeek":["Monday","Tuesday","Wednesday","Thursday","Friday"],"opens":"08:00","closes":"18:00"}],"sameAs":["https://g.page/cafeducanal"],"version":"1.0.0","dateModified":"2025-09-17"}
{"@context":"https://schema.org","@type":"Restaurant","@id":"https://urbanandyou.com/lieux/osteria-alba#id","name":"Osteria Alba","url":"https://urbanandyou.com/lieux/osteria-alba","inLanguage":"fr","address":{"@type":"PostalAddress","streetAddress":"5 Rue des Lilas","addressLocality":"Lyon","postalCode":"69003","addressCountry":"FR"},"geo":{"@type":"GeoCoordinates","latitude":45.7601,"longitude":4.8581},"priceRange":"€€€","servesCuisine":["Italien","Pâtes"],"telephone":"+33 4 00 00 00 00","version":"1.2.0","dateModified":"2025-09-10"}

Consumption

curl -H "Accept: application/x-ndjson" https://urbanandyou.com/data/lieux.jsonl | head -n 1 | jq .
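On the producing side, the "one line = one object" rule can be sketched as follows. The helper names toJsonl and parseJsonl are hypothetical. The approach relies on JSON.stringify escaping line breaks inside strings, so each record is guaranteed to occupy exactly one physical line.

```javascript
// Serializes records to JSONL: one compact JSON document per line.
// JSON.stringify escapes "\n" inside strings, so no record can span two lines.
function toJsonl(records) {
  if (records.length === 0) return '';
  return records.map((record) => JSON.stringify(record)).join('\n') + '\n';
}

// Parses JSONL text back into objects, ignoring blank lines.
function parseJsonl(text) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => JSON.parse(line));
}
```

parseJsonl loads the whole text in memory, which is fine for tests and small feeds; for large files a streaming reader is the safer choice.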


6) HTTP, cache & performance

6.1. Recommended headers

Content-Type: application/ld+json; charset=utf-8
Cache-Control: public, max-age=3600, stale-while-revalidate=86400
ETag: "cafe-du-canal-1.0.0"
Vary: Accept

  • ETag: derived from the @id and the version.
  • Cache: 1 h of freshness (max-age=3600), then stale-while-revalidate for 24 h.
  • Vary: needed if content negotiation is enabled.
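One way to derive the ETag from the @id and the version, and to decide whether a conditional request can be answered with 304 Not Modified, is sketched below. The function names (makeEtag, isFresh) and the choice of the last path segment as slug are assumptions; they were picked so that the result matches the "cafe-du-canal-1.0.0" example above.

```javascript
// Builds a quoted ETag from the last path segment of the @id plus the version.
function makeEtag(id, version) {
  const segments = new URL(id).pathname.split('/').filter(Boolean);
  const slug = segments.pop() || 'root'; // 'root' is an assumed fallback for "/"
  return `"${slug}-${version}"`;
}

// Returns true when the client's If-None-Match header already lists this ETag,
// in which case the server may reply 304 without a body.
// Weak validators (W/"...") are compared without their prefix.
function isFresh(ifNoneMatch, etag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === '*') return true;
  return ifNoneMatch
    .split(',')
    .map((token) => token.trim().replace(/^W\//, ''))
    .includes(etag);
}
```

Splitting on commas is a simplification that holds as long as the ETag values themselves never contain a comma, which is true for the slug-plus-version format assumed here.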

6.2. Content negotiation (optional)

  • Accept: application/ld+json → JSON-LD
  • Accept: text/html → HTML page
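The negotiation rule above can be sketched as a small function that reads the Accept header and picks one of the two representations. This is a simplified illustration: the name pickRepresentation is hypothetical, HTML is assumed to be the default when the client has no strict preference, and the parser only handles media ranges and q-values.

```javascript
// Chooses between JSON-LD and HTML from an Accept header.
// Assumption: HTML wins ties, so browsers and "*/*" clients get the page.
function pickRepresentation(acceptHeader) {
  const ranges = (acceptHeader || '*/*').split(',').map((part) => {
    const [type, ...params] = part.trim().split(';').map((s) => s.trim());
    const qParam = params.find((p) => p.startsWith('q='));
    const q = qParam ? Number(qParam.slice(2)) : 1;
    return { type: type.toLowerCase(), q: Number.isNaN(q) ? 0 : q };
  });

  // Quality for one media type: exact match first, then "type/*", then "*/*".
  const quality = (mediaType) => {
    const main = mediaType.split('/')[0];
    const match =
      ranges.find((r) => r.type === mediaType) ||
      ranges.find((r) => r.type === `${main}/*`) ||
      ranges.find((r) => r.type === '*/*');
    return match ? match.q : 0;
  };

  const jsonld = quality('application/ld+json');
  const html = quality('text/html');
  return jsonld > html ? 'application/ld+json' : 'text/html';
}
```

Whenever the response depends on this choice, it should carry Vary: Accept (as in section 6.1) so that caches store the two representations separately.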

7) Versioning & traceability

In each record:

{
"version": "1.1.0",
"dateModified": "2025-10-01",
"schemaVersion": "uay-localbusiness@1.1"
}

Rule: every update changes both dateModified and version.
Change log (CHANGELOG): /data/changelog.jsonl (optional).
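The update rule can be enforced by always going through a single helper that touches both fields at once. The sketch below assumes that version follows a MAJOR.MINOR.PATCH pattern, as the sample values ("1.0.0", "1.1.0", "1.2.0") suggest; the function name bumpVersion and its level parameter are hypothetical.

```javascript
// Returns a copy of the record with version bumped and dateModified refreshed.
// Assumes a MAJOR.MINOR.PATCH version string; the input object is not mutated.
function bumpVersion(record, level = 'patch', now = new Date()) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(String(record.version));
  if (!match) throw new Error(`Unsupported version format: ${record.version}`);
  let [major, minor, patch] = match.slice(1).map(Number);

  if (level === 'major') {
    major += 1; minor = 0; patch = 0;
  } else if (level === 'minor') {
    minor += 1; patch = 0;
  } else if (level === 'patch') {
    patch += 1;
  } else {
    throw new Error(`Unknown level: ${level}`);
  }

  return {
    ...record,
    version: `${major}.${minor}.${patch}`,
    dateModified: now.toISOString().slice(0, 10), // YYYY-MM-DD, in UTC
  };
}
```

Note that the date is taken in UTC; a site that wants local Paris dates would need to format it differently.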


8) Transparency & labelling

  • Partner content carries the visible label “Soutenu par l’adresse” ("Supported by the venue") and may add the following markup:
    {"@type":"CreativeWork","@id":"...#article","funding":{"@type":"MonetaryGrant","name":"Soutien de l'adresse"}}
  • Editorial policies (ethics, corrections, diversity) are exposed on the Organization via publishingPrinciples, ethicsPolicy, etc.
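To keep the visible label and the structured markup in sync, both can be driven from the same flag. The sketch below is hypothetical (the names markSponsored and visibleLabel are not from the original); it only reuses the funding object and the label text given above.

```javascript
// Label text shown on partner content, as specified in this section.
const SPONSOR_LABEL = 'Soutenu par l’adresse';

// Returns a copy of the article node with the funding markup attached.
function markSponsored(article) {
  return {
    ...article,
    funding: { '@type': 'MonetaryGrant', name: "Soutien de l'adresse" },
  };
}

// Returns the label to render on the page, or null for regular content.
function visibleLabel(article) {
  return article.funding ? SPONSOR_LABEL : null;
}
```

Deriving the label from the presence of funding means a page cannot carry the markup without the visible mention, or the reverse.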

9) robots.txt

User-agent: *
Allow: /
Sitemap: https://urbanandyou.com/sitemaps/sitemap-index.xml


10) Validation & QA

  • Structural tests: JSON-LD validator, Rich Results Test, lint on schema.org keys.
  • Business checks: consistent contact and location details and opening hours, working external links, images over 1200 px, optimized file size.
  • Accessibility: descriptive alt text, contrast, visible focus.
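Part of the business checks can be automated. The sketch below covers two of them, opening hours and geographic coordinates, and is only one possible reading of "consistent": the function names are hypothetical, day names are assumed to be the short English forms used in the examples, and a closing time after midnight would be flagged and need manual review.

```javascript
const DAYS = new Set([
  'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday',
]);
const HHMM = /^([01]\d|2[0-3]):[0-5]\d$/;

// Checks an openingHoursSpecification array; returns a list of problems.
function checkOpeningHours(specs) {
  const problems = [];
  specs.forEach((spec, i) => {
    const days = [].concat(spec.dayOfWeek ?? []); // accept a string or an array
    if (days.length === 0) problems.push(`spec ${i}: no dayOfWeek`);
    for (const day of days) {
      if (!DAYS.has(day)) problems.push(`spec ${i}: unknown day ${day}`);
    }
    if (!HHMM.test(spec.opens) || !HHMM.test(spec.closes)) {
      problems.push(`spec ${i}: opens/closes must be HH:MM`);
    } else if (spec.opens >= spec.closes) {
      // Zero-padded HH:MM strings compare correctly as plain strings.
      problems.push(`spec ${i}: opens is not before closes`);
    }
  });
  return problems;
}

// Checks that latitude and longitude are numbers within valid ranges.
function checkGeo(geo) {
  const { latitude, longitude } = geo ?? {};
  const ok =
    Number.isFinite(latitude) && Number.isFinite(longitude) &&
    Math.abs(latitude) <= 90 && Math.abs(longitude) <= 180;
  return ok ? [] : ['geo: latitude/longitude missing or out of range'];
}
```

Link checking and image dimensions need network or file access and are deliberately left out of this sketch.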

11) Security & GDPR (summary)

  • Minimization: only the data needed for recommendations.
  • No sensitive personal data.
  • Form: explicit consent and a privacy notice.
  • Processors documented (hosting, form, payment).

12) Roadmap (suggestions)

  • Structured ratings/reviews (provided a moderation policy is in place).
  • “Data changed” webhook (JSON ping) for third-party consumers.
  • /data/articles.jsonl as a mirror feed for editorial content.
  • Languages: availableLanguage:["fr","en"] on the Organization, progressive i18n.

13) Production checklist

  • Stable @id values + Article ↔ LocalBusiness cross-links
  • dateModified and version up to date
  • Sitemaps listed in the index and submitted to search engines
  • /data/*.jsonl feeds exposed (200, cache, ETag)
  • Clean robots.txt
  • “Soutenu par l’adresse” label wherever applicable
  • Accessibility audit + image performance (CDN, modern formats)
  • Validator QA (no critical errors)

Appendix: reusable snippets

HTTP headers (Nginx)

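# .json is served as JSON-LD, .jsonl as NDJSON; the Content-Type comes from "types", not add_header.
location ~* \.(json|jsonl)$ {
    types {
        application/ld+json  json;
        application/x-ndjson jsonl;
    }
    charset utf-8;
    charset_types application/ld+json application/x-ndjson;
    add_header Cache-Control "public, max-age=3600, stale-while-revalidate=86400";
    add_header Vary "Accept";
    try_files $uri =404;
}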

Node.js micro-example (reading the JSONL)

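import fs from 'node:fs';
import readline from 'node:readline';

// Stream the feed line by line so large files never sit fully in memory.
const rl = readline.createInterface({
  input: fs.createReadStream('lieux.jsonl', { encoding: 'utf8' }),
  crlfDelay: Infinity, // treat \r\n as a single line break
});

rl.on('line', (line) => {
  if (line.trim() === '') return; // skip blank lines
  try {
    const obj = JSON.parse(line);
    // Only print nodes that look like JSON-LD entities.
    if (obj['@type'] && obj['@id']) console.log(obj['@id'], obj.name);
  } catch (err) {
    // Report the malformed line instead of dropping it silently.
    console.error('Skipping invalid JSONL line:', err.message);
  }
});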