Common Crawl Foundation

Team
non-profit
Verified
https://commoncrawl.org
commoncrawl

AI & ML interests

Crawled data and metadata

Recent Activity

malteos  updated a bucket 1 day ago
commoncrawl/commoncrawl
parismic  updated a dataset 1 day ago
commoncrawl/statistics
malteos  new activity 4 days ago
commoncrawl/commonlid-results: Add supported_languages to summary.json for cov. leaderboard view

Team members: Thom Vaughan, Pedro Ortiz Suarez, Paul Lazar, Greg Lindahl, Ford H, Jen English, Sebastian Nagel, Laurie Burchell, Hande Celikkanat, malteos, Thijs Dalhuijsen, Luca, Catherine Arnett, Michael Paris

commoncrawl's datasets (8)

commoncrawl/statistics

Viewer • Updated 1 day ago • 631k • 603 • 26

commoncrawl/commonlid-results

Preview • Updated 20 days ago • 808

commoncrawl/citations

Viewer • Updated Apr 2 • 9.18k • 127 • 2

commoncrawl/CommonLID

Viewer • Updated Feb 10 • 373k • 243 • 52

commoncrawl/gneissweb-annotation-host-testing-v1

Viewer • Updated Dec 11, 2025 • 617M • 95

commoncrawl/gneissweb-annotation-url-testing-v1

Viewer • Updated Dec 10, 2025 • 11.5B • 6.32k

commoncrawl/host-index-testing-v2

Preview • Updated Nov 10, 2025 • 33.4k

commoncrawl/eot2024_hostlevel_logs

Viewer • Updated Oct 9, 2024 • 271k • 5 • 1