From 4a334d9cda5345069af076f17f8f887d652b61b8 Mon Sep 17 00:00:00 2001
From: Semen Frolov
Date: Tue, 21 Apr 2026 21:30:40 +0300
Subject: [PATCH 01/73] docs(message-brokers): add FastStream and Message Brokers category

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 407ddf9..63cd8f7 100644
--- a/README.md
+++ b/README.md
@@ -71,6 +71,7 @@ An opinionated list of Python frameworks, libraries, tools, and resources.
 - [DevOps Tools](#devops-tools)
 - [Distributed Computing](#distributed-computing)
 - [Task Queues](#task-queues)
+- [Message Brokers](#message-brokers)
 - [Job Schedulers](#job-schedulers)
 - [Logging](#logging)
 - [Network Virtualization](#network-virtualization)
@@ -737,6 +738,12 @@ _Libraries for working with task queues._
 - [huey](https://github.com/coleifer/huey) - Little multi-threaded task queue.
 - [rq](https://github.com/rq/rq) - Simple job queues for Python.

+## Message Brokers
+
+_Libraries for working with message brokers and event streaming._
+
+- [faststream](https://github.com/ag2ai/faststream) - A framework for building asynchronous services over Apache Kafka, RabbitMQ, NATS, and Redis.
+
 ## Job Schedulers

 _Libraries for scheduling jobs._

From 0a3205378d5590e0ad348b961416d0c0c88b87dd Mon Sep 17 00:00:00 2001
From: Basit Rahim
Date: Thu, 23 Apr 2026 11:28:26 +0500
Subject: [PATCH 02/73] Added simulation library

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 2c6956f..16699f5 100644
--- a/README.md
+++ b/README.md
@@ -543,6 +543,7 @@ _Libraries for scientific computing. Also see [Python-for-Scientists](https://gi
   - [pydy](https://github.com/pydy/pydy) - Short for Python Dynamics, used to assist with workflow in the modeling of dynamic motion.
   - [PythonRobotics](https://github.com/AtsushiSakai/PythonRobotics) - This is a compilation of various robotics algorithms with visualizations.
 - Simulation and Modeling
+  - [mesa](https://github.com/projectmesa/mesa) - An agent-based modeling framework for building, analyzing, and visualizing complex system simulations.
   - [pathsim](https://github.com/pathsim/pathsim) - A block-based system modeling and simulation framework with a browser-based visual editor.
   - [pymc](https://github.com/pymc-devs/pymc) - Probabilistic programming and Bayesian modeling in Python.
   - [simpy](https://gitlab.com/team-simpy/simpy) - A process-based discrete-event simulation framework.

From 427dfc000310d618ecac57bafce66b62a134961e Mon Sep 17 00:00:00 2001
From: Semen Frolov <148821259+vvlrff@users.noreply.github.com>
Date: Thu, 23 Apr 2026 22:45:50 +0300
Subject: [PATCH 03/73] Rename Message Brokers to Messaging

Updated 'Message Brokers' section to 'Messaging' and added MQTT to the faststream description.
---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 77e829d..f561de3 100644
--- a/README.md
+++ b/README.md
@@ -71,7 +71,7 @@ An opinionated list of Python frameworks, libraries, tools, and resources.
 - [DevOps Tools](#devops-tools)
 - [Distributed Computing](#distributed-computing)
 - [Task Queues](#task-queues)
-- [Message Brokers](#message-brokers)
+- [Messaging](#messaging)
 - [Job Schedulers](#job-schedulers)
 - [Logging](#logging)
 - [Network Virtualization](#network-virtualization)
@@ -739,11 +739,11 @@ _Libraries for working with task queues._
 - [huey](https://github.com/coleifer/huey) - Little multi-threaded task queue.
 - [rq](https://github.com/rq/rq) - Simple job queues for Python.

-## Message Brokers
+## Messaging

 _Libraries for working with message brokers and event streaming._

-- [faststream](https://github.com/ag2ai/faststream) - A framework for building asynchronous services over Apache Kafka, RabbitMQ, NATS, and Redis.
+- [faststream](https://github.com/ag2ai/faststream) - A framework for building asynchronous services over Apache Kafka, RabbitMQ, NATS, MQTT and Redis.

 ## Job Schedulers

From 39b1476ac47ab573b8c27f15f187a0821907b075 Mon Sep 17 00:00:00 2001
From: Morteza Hosseini
Date: Fri, 24 Apr 2026 12:33:01 +0100
Subject: [PATCH 04/73] docs(ai-agents): add OpenAI's framework for building AI agents

Co-authored-by: Copilot
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 2c6956f..e441a5c 100644
--- a/README.md
+++ b/README.md
@@ -141,6 +141,7 @@ _Libraries for building AI applications, LLM integrations, and autonomous agents
   - [dspy](https://github.com/stanfordnlp/dspy) - A framework for programming, not prompting, language models.
   - [hermes-agent](https://github.com/nousresearch/hermes-agent) - An adaptive AI agent framework that grows with you.
   - [langchain](https://github.com/langchain-ai/langchain) - Building applications with LLMs through composability.
+  - [openai-agents](https://github.com/openai/openai-agents-python) - OpenAI's framework for building and managing AI agents.
   - [pydantic-ai](https://github.com/pydantic/pydantic-ai) - A Python agent framework for building generative AI applications with structured schemas.
   - [TradingAgents](https://github.com/TauricResearch/TradingAgents) - A multi-agents LLM financial trading framework.
 - Data Layer

From e386fbb0e6d7959cdec9943a829bbceb4e6591c0 Mon Sep 17 00:00:00 2001
From: cak
Date: Fri, 24 Apr 2026 08:58:59 -0400
Subject: [PATCH 05/73] add Web Security section and secure

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 2c6956f..79ea505 100644
--- a/README.md
+++ b/README.md
@@ -114,6 +114,7 @@ An opinionated list of Python frameworks, libraries, tools, and resources.
 - [Cryptography](#cryptography)
 - [Penetration Testing](#penetration-testing)
+- [Web Security](#web-security)

 **Miscellaneous**

@@ -1080,6 +1081,12 @@ _Frameworks and tools for penetration testing._
 - [sherlock](https://github.com/sherlock-project/sherlock) - Hunt down social media accounts by username across social networks.
 - [sqlmap](https://github.com/sqlmapproject/sqlmap) - Automatic SQL injection and database takeover tool.

+## Web Security
+
+_Libraries for application-layer web security._
+
+- [secure](https://github.com/TypeError/secure) - HTTP security headers for Python web applications with ASGI and WSGI middleware.
+
 **Miscellaneous**

 ## Hardware

From 385402159949fd247067387f404d29726999e79f Mon Sep 17 00:00:00 2001
From: Jinyang
Date: Sun, 26 Apr 2026 20:12:09 +0400
Subject: [PATCH 06/73] Add OpenChronicle to the list of AI frameworks

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 17d5a9b..d80681f 100644
--- a/README.md
+++ b/README.md
@@ -144,6 +144,7 @@ _Libraries for building AI applications, LLM integrations, and autonomous agents
   - [hermes-agent](https://github.com/nousresearch/hermes-agent) - An adaptive AI agent framework that grows with you.
   - [langchain](https://github.com/langchain-ai/langchain) - Building applications with LLMs through composability.
   - [openai-agents](https://github.com/openai/openai-agents-python) - OpenAI's framework for building and managing AI agents.
+  - [OpenChronicle](https://github.com/Einsia/OpenChronicle) - Open-source, local-first memory for any tool-capable LLM agent.
   - [pydantic-ai](https://github.com/pydantic/pydantic-ai) - A Python agent framework for building generative AI applications with structured schemas.
   - [TradingAgents](https://github.com/TauricResearch/TradingAgents) - A multi-agents LLM financial trading framework.
 - Data Layer

From b582db93f39cfa313bcee0912d87857207eb4ed2 Mon Sep 17 00:00:00 2001
From: Joseba Fuentes
Date: Mon, 27 Apr 2026 17:56:45 +0200
Subject: [PATCH 07/73] Merge pull request #3071 from Tlaloc-Es/patch-1

Add KillPy to environment management section
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 17d5a9b..11932e2 100644
--- a/README.md
+++ b/README.md
@@ -1029,6 +1029,7 @@ _Libraries for working with dates and times._

 _Libraries for Python version and virtual environment management._

+- [KillPy](https://github.com/Tlaloc-Es/killpy) - Analyze, detect, and clean unused Python environments and pipx packages.
 - [pyenv](https://github.com/pyenv/pyenv) - Simple Python version management.
 - [pyenv-win](https://github.com/pyenv-win/pyenv-win) - Pyenv for Windows.
 - [uv](https://github.com/astral-sh/uv) - An extremely fast Python version, package and project manager, written in Rust.

From aa25d61e29cc0bc555e56ba880f9dde3f0189ae1 Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Fri, 1 May 2026 23:07:03 +0800
Subject: [PATCH 08/73] add also see awesome-python-testing to Testing

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 0e92c24..799fa07 100644
--- a/README.md
+++ b/README.md
@@ -620,7 +620,7 @@ _Tools of static analysis, linters and code quality checkers. Also see [awesome-

 ## Testing

-_Libraries for testing codebases and generating test data._
+_Libraries for testing codebases and generating test data. Also see [awesome-python-testing](https://github.com/cleder/awesome-python-testing)._

 - Frameworks
   - [hypothesis](https://github.com/HypothesisWorks/hypothesis) - Hypothesis is an advanced Quickcheck style property based testing library.

From 42a04dcd55846f03511d2c7a5efe8ec2a0c80d05 Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Fri, 1 May 2026 23:08:39 +0800
Subject: [PATCH 09/73] add awesome-pytest under pytest

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 799fa07..84c137a 100644
--- a/README.md
+++ b/README.md
@@ -625,6 +625,7 @@ _Libraries for testing codebases and generating test data. Also see [awesome-pyt
 - Frameworks
   - [hypothesis](https://github.com/HypothesisWorks/hypothesis) - Hypothesis is an advanced Quickcheck style property based testing library.
   - [pytest](https://github.com/pytest-dev/pytest) - A mature full-featured Python testing tool.
+    - [awesome-pytest](https://github.com/augustogoulart/awesome-pytest)
   - [robotframework](https://github.com/robotframework/robotframework) - A generic test automation framework.
   - [scanapi](https://github.com/scanapi/scanapi) - Automated Testing and Documentation for your REST API.
   - [unittest](https://docs.python.org/3/library/unittest.html) - (Python standard library) Unit testing framework.

From ccd4fb7591e25aa08918def41f4432793a12b9fa Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Fri, 1 May 2026 23:13:58 +0800
Subject: [PATCH 10/73] List https://github.com/wsvincent/awesome-django instead

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 84c137a..51ae9d1 100644
--- a/README.md
+++ b/README.md
@@ -229,7 +229,7 @@ _Traditional full stack web frameworks. Also see [Web APIs](#web-apis)._
 - Synchronous
   - [bottle](https://github.com/bottlepy/bottle) - A fast and simple micro-framework distributed as a single file with no dependencies.
   - [django](https://github.com/django/django) - The most popular web framework in Python.
-    - [awesome-django](https://github.com/shahraizali/awesome-django)
+    - [awesome-django](https://github.com/wsvincent/awesome-django)
   - [flask](https://github.com/pallets/flask) - A microframework for Python.
     - [awesome-flask](https://github.com/humiaozuzu/awesome-flask)
   - [pyramid](https://github.com/Pylons/pyramid) - A small, fast, down-to-earth, open source Python web framework.

From d9f26a86357b4b9fe2d498f8ef3fac8b68005931 Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Sat, 2 May 2026 01:53:19 +0800
Subject: [PATCH 11/73] Improve SEO/AEO discovery surface for awesome-python.com (#3103)

* update gitignore

* feat: tighten homepage metadata

* fix: trim generated HTML whitespace

* feat(website): add discovery files and markdown alternate

* feat(website): add sitemap lastmod

* feat(seo): add Content-Signal directive to robots.txt

Signals search, ai-input, and ai-train to crawlers via the experimental Content-Signal header in robots.txt.

Co-Authored-By: Claude

---------

Co-authored-by: Claude
---
 .gitignore | 14 +--
 README.md | 2 +-
 website/build.py | 69 ++++++++++++++-
 website/templates/base.html | 35 ++++----
 website/tests/test_build.py | 167 ++++++++++++++++++++++++++++++++++++
 5 files changed, 260 insertions(+), 27 deletions(-)

diff --git a/.gitignore b/.gitignore
index ca26a6e..0d9f410 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,12 +10,12 @@ __pycache__/
 website/output/
 website/data/

-# claude code
-.claude/skills/
-.gstack/
-.playwright-cli/
-.superpowers/
-skills-lock.json
+# planning docs
+docs/

-# codex
+# agents
 .agents/
+.claude/skills/
+.superpowers/
+.playwright-cli/
+skills-lock.json
diff --git a/README.md b/README.md
index 51ae9d1..107b685 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Awesome Python

-An opinionated list of Python frameworks, libraries, tools, and resources.
+An opinionated guide to the best Python frameworks, libraries, tools, and resources.
 # **Sponsors**

diff --git a/website/build.py b/website/build.py
index c223ef1..8fb5f38 100644
--- a/website/build.py
+++ b/website/build.py
@@ -4,6 +4,8 @@
 import json
 import re
 import shutil
+import xml.etree.ElementTree as ET
+from collections.abc import Sequence
 from datetime import UTC, datetime
 from pathlib import Path
 from typing import Any
@@ -12,6 +14,9 @@ from jinja2 import Environment, FileSystemLoader
 from readme_parser import ParsedGroup, ParsedSection, parse_readme, parse_sponsors

 GITHUB_REPO_URL_RE = re.compile(r"^https?://github\.com/([^/]+/[^/]+?)(?:\.git)?/?$")
+SITE_URL = "https://awesome-python.com/"
+SITEMAP_URL = f"{SITE_URL}sitemap.xml"
+SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

 SOURCE_TYPE_DOMAINS = {
     "docs.python.org": "Built-in",
@@ -67,6 +72,59 @@ def sort_entries(entries: list[dict]) -> list[dict]:
     return sorted(entries, key=sort_key)


+def build_robots_txt() -> str:
+    return (
+        "User-agent: *\n"
+        "Content-Signal: search=yes, ai-input=yes, ai-train=yes\n"
+        "Allow: /\n"
+        "\n"
+        f"Sitemap: {SITEMAP_URL}\n"
+    )
+
+
+def write_sitemap_xml(path: Path, urls: Sequence[tuple[str, str]]) -> None:
+    ET.register_namespace("", SITEMAP_NS)
+    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
+    for url, lastmod in urls:
+        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
+        loc_el = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
+        loc_el.text = url
+        lastmod_el = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}lastmod")
+        lastmod_el.text = lastmod
+
+    ET.ElementTree(urlset).write(path, encoding="utf-8", xml_declaration=True)
+    with path.open("ab") as f:
+        f.write(b"\n")
+
+
+def top_level_heading_text(line: str) -> str | None:
+    stripped = line.strip()
+    if not stripped.startswith("# "):
+        return None
+    return stripped.removeprefix("#").strip().strip("#").strip().strip("*").strip()
+
+
+def remove_sponsors_section(markdown: str) -> str:
+    lines = markdown.splitlines(keepends=True)
+    start_idx = None
+    for i, line in enumerate(lines):
+        heading = top_level_heading_text(line)
+        if heading and heading.lower() == "sponsors":
+            start_idx = i
+            break
+
+    if start_idx is None:
+        return markdown
+
+    end_idx = len(lines)
+    for i, line in enumerate(lines[start_idx + 1 :], start=start_idx + 1):
+        if top_level_heading_text(line):
+            end_idx = i
+            break
+
+    return "".join(lines[:start_idx] + lines[end_idx:])
+
+
 def extract_entries(
     categories: list[ParsedSection],
     groups: list[ParsedGroup],
@@ -131,6 +189,7 @@ def build(repo_root: Path) -> None:
     categories = [cat for g in parsed_groups for cat in g["categories"]]
     total_entries = sum(c["entry_count"] for c in categories)
     entries = extract_entries(categories, parsed_groups)
+    build_date = datetime.now(UTC)

     stars_data = load_stars(website / "data" / "github_stars.json")

@@ -155,6 +214,8 @@ def build(repo_root: Path) -> None:
     env = Environment(
         loader=FileSystemLoader(website / "templates"),
         autoescape=True,
+        trim_blocks=True,
+        lstrip_blocks=True,
     )

     site_dir = website / "output"
@@ -171,7 +232,7 @@ def build(repo_root: Path) -> None:
             total_entries=total_entries,
             total_categories=len(categories),
             repo_stars=repo_stars,
-            build_date=datetime.now(UTC).strftime("%B %d, %Y"),
+            build_date=build_date.strftime("%B %d, %Y"),
             sponsors=sponsors,
         ),
         encoding="utf-8",
@@ -182,7 +243,11 @@ def build(repo_root: Path) -> None:
     if static_src.exists():
         shutil.copytree(static_src, static_dst, dirs_exist_ok=True)

-    (site_dir / "llms.txt").write_text(readme_text, encoding="utf-8")
+    markdown_index = remove_sponsors_section(readme_text)
+    (site_dir / "robots.txt").write_text(build_robots_txt(), encoding="utf-8")
+    write_sitemap_xml(site_dir / "sitemap.xml", [(SITE_URL, build_date.date().isoformat())])
+    (site_dir / "index.md").write_text(markdown_index, encoding="utf-8")
+    (site_dir / "llms.txt").write_text(markdown_index, encoding="utf-8")

     print(f"Built single page with {len(parsed_groups)} groups, {len(categories)} categories")
     print(f"Total entries: {total_entries}")
diff --git a/website/templates/base.html b/website/templates/base.html
index 34546e7..af11209 100644
--- a/website/templates/base.html
+++ b/website/templates/base.html
@@ -1,26 +1,27 @@
+ {% set default_meta_title = "Awesome Python" %}
+ {% set default_meta_description = "An opinionated guide to the best Python frameworks, libraries, and tools. Explore " ~ (entries | length) ~ " curated projects across " ~ total_categories ~ " categories, from AI and agents to data science and web development." %}
+ {% set canonical_url = "https://awesome-python.com/" %}
+ {% set social_image_url = "https://awesome-python.com/static/og-image.png" %}
+ {% set meta_title %}{% block title %}{{ default_meta_title }}{% endblock %}{% endset %}
+ {% set meta_description %}{% block description %}{{ default_meta_description }}{% endblock %}{% endset %}
- {% block title %}Awesome Python{% endblock %}
-
-
+ {{ meta_title | trim }}
+
+
+
-
-
-
-
-
+
+
+
+
+
+
+
+
diff --git a/website/tests/test_build.py b/website/tests/test_build.py
index 0b22609..1feab77 100644
--- a/website/tests/test_build.py
+++ b/website/tests/test_build.py
@@ -3,6 +3,9 @@
 import json
 import shutil
 import textwrap
+import xml.etree.ElementTree as ET
+from datetime import UTC, date, datetime
+from html.parser import HTMLParser
 from pathlib import Path

 from build import (
@@ -15,6 +18,40 @@ from build import (
 )
 from readme_parser import parse_readme, slugify

+
+class HeadMetadataParser(HTMLParser):
+    def __init__(self):
+        super().__init__()
+        self.title_count = 0
+        self.title = ""
+        self.meta_by_name = {}
+        self.meta_by_property = {}
+        self.links_by_rel = {}
+        self._in_title = False
+
+    def handle_starttag(self, tag, attrs):
+        attrs = dict(attrs)
+        if tag == "title":
+            self.title_count += 1
+            self._in_title = True
+        elif tag == "meta":
+            if "name" in attrs:
+                self.meta_by_name[attrs["name"]] = attrs.get("content", "")
+            if "property" in attrs:
+                self.meta_by_property[attrs["property"]] = attrs.get("content", "")
+        elif tag == "link" and attrs.get("rel"):
+            for rel in attrs["rel"].split():
+                self.links_by_rel[rel] = attrs.get("href", "")
+
+    def handle_endtag(self, tag):
+        if tag == "title":
+            self._in_title = False
+
+    def handle_data(self, data):
+        if self._in_title:
+            self.title += data
+
+
 # ---------------------------------------------------------------------------
 # slugify
 # ---------------------------------------------------------------------------
@@ -72,6 +109,11 @@ class TestBuild:
             encoding="utf-8",
         )

+    def _copy_real_templates(self, tmp_path):
+        real_tpl = Path(__file__).parent / ".." / "templates"
+        tpl_dir = tmp_path / "website" / "templates"
+        shutil.copytree(real_tpl, tpl_dir)
+
     def test_build_creates_single_page(self, tmp_path):
         readme = textwrap.dedent("""\
             # Awesome Python
@@ -114,6 +156,97 @@ class TestBuild:
         # No category sub-pages
         assert not (site / "categories").exists()

+    def test_build_creates_root_discovery_files(self, tmp_path):
+        readme = textwrap.dedent("""\
+            # Awesome Python
+
+            Intro.
+
+            ---
+
+            ## Widgets
+
+            - [w1](https://example.com) - A widget.
+
+            # Contributing
+
+            Help!
+        """)
+        self._make_repo(tmp_path, readme)
+        start_date = datetime.now(UTC).date()
+        build(tmp_path)
+        end_date = datetime.now(UTC).date()
+
+        site = tmp_path / "website" / "output"
+        robots = (site / "robots.txt").read_text(encoding="utf-8")
+        assert robots == (
+            "User-agent: *\n"
+            "Content-Signal: search=yes, ai-input=yes, ai-train=yes\n"
+            "Allow: /\n"
+            "\n"
+            "Sitemap: https://awesome-python.com/sitemap.xml\n"
+        )
+
+        sitemap = ET.parse(site / "sitemap.xml")
+        root = sitemap.getroot()
+        ns = {"sitemap": "http://www.sitemaps.org/schemas/sitemap/0.9"}
+        locs = [loc.text for loc in root.findall("sitemap:url/sitemap:loc", ns)]
+        lastmods = [lastmod.text for lastmod in root.findall("sitemap:url/sitemap:lastmod", ns)]
+
+        assert root.tag == "{http://www.sitemaps.org/schemas/sitemap/0.9}urlset"
+        assert locs == ["https://awesome-python.com/"]
+        assert len(lastmods) == 1
+        assert start_date <= date.fromisoformat(lastmods[0]) <= end_date
+        assert all(loc.startswith("https://awesome-python.com/") for loc in locs)
+        assert all("?" not in loc for loc in locs)
+
+    def test_build_creates_markdown_alternate_without_sponsors(self, tmp_path):
+        readme = textwrap.dedent("""\
+            # Awesome Python
+
+            Intro.
+
+            # **Sponsors**
+
+            - **[Sponsor](https://sponsor.example.com)**: Sponsored tool.
+
+            > Become a sponsor: [Sponsor us](SPONSORSHIP.md).
+
+            # Categories
+
+            **Tools**
+
+            - [Widgets](#widgets)
+
+            ---
+
+            ## Widgets
+
+            - [w1](https://example.com) - A widget.
+
+            # Contributing
+
+            Help!
+        """)
+        (tmp_path / "README.md").write_text(readme, encoding="utf-8")
+        self._copy_real_templates(tmp_path)
+
+        build(tmp_path)
+
+        site = tmp_path / "website" / "output"
+        index_html = (site / "index.html").read_text(encoding="utf-8")
+        index_md = (site / "index.md").read_text(encoding="utf-8")
+        llms_txt = (site / "llms.txt").read_text(encoding="utf-8")
+
+        assert '' in index_html
+        assert index_md == llms_txt
+        assert index_md.startswith("# Awesome Python\n\nIntro.\n\n# Categories")
+        assert "# **Sponsors**" not in index_md
+        assert "Sponsor" not in index_md
+        assert "SPONSORSHIP.md" not in index_md
+        assert "## Widgets" in index_md
+        assert "- [w1](https://example.com) - A widget." in index_md
+
     def test_build_cleans_stale_output(self, tmp_path):
         readme = textwrap.dedent("""\
             # T
@@ -235,6 +368,40 @@ class TestBuild:
         # Expand content present
         assert "expand-content" in html

+    def test_index_contains_aligned_homepage_metadata(self, tmp_path):
+        readme = (Path(__file__).parents[2] / "README.md").read_text(encoding="utf-8")
+        (tmp_path / "README.md").write_text(readme, encoding="utf-8")
+        self._copy_real_templates(tmp_path)
+
+        build(tmp_path)
+
+        parsed_groups = parse_readme(readme)
+        categories = [cat for group in parsed_groups for cat in group["categories"]]
+        entries = extract_entries(categories, parsed_groups)
+        html = (tmp_path / "website" / "output" / "index.html").read_text(encoding="utf-8")
+        parser = HeadMetadataParser()
+        parser.feed(html)
+
+        expected_title = "Awesome Python"
+        expected_description = f"An opinionated guide to the best Python frameworks, libraries, and tools. Explore {len(entries)} curated projects across {len(categories)} categories, from AI and agents to data science and web development."
+        expected_url = "https://awesome-python.com/"
+        expected_image = "https://awesome-python.com/static/og-image.png"
+
+        assert parser.title_count == 1
+        assert parser.title.strip() == expected_title
+        assert parser.meta_by_name["description"] == expected_description
+        assert parser.links_by_rel["canonical"] == expected_url
+        assert parser.meta_by_property["og:type"] == "website"
+        assert parser.meta_by_property["og:title"] == expected_title
+        assert parser.meta_by_property["og:description"] == expected_description
+        assert parser.meta_by_property["og:image"] == expected_image
+        assert parser.meta_by_property["og:url"] == expected_url
+        assert parser.meta_by_name["twitter:card"] == "summary_large_image"
+        assert parser.meta_by_name["twitter:title"] == expected_title
+        assert parser.meta_by_name["twitter:description"] == expected_description
+        assert parser.meta_by_name["twitter:image"] == expected_image
+        assert "\n
Date: Sat, 2 May 2026 02:32:18 +0800
Subject: [PATCH 12/73] feat: generate llms.txt from template and annotate entries with star counts

- Add llms.txt Jinja2 template with a categories_md placeholder
- Extract categories body from README and inject it into the template
- Annotate bullet-entry lines with GitHub star counts (N GitHub stars) for the main index.md and bare numbers for llms.txt
- Add TestAnnotateEntriesWithStars unit tests

Co-Authored-By: Claude
---
 website/build.py | 76 +++++++++++++++++++++++++++++++-
 website/templates/llms.txt | 9 ++++
 website/tests/test_build.py | 87 ++++++++++++++++++++++++++++++++++++-
 3 files changed, 169 insertions(+), 3 deletions(-)
 create mode 100644 website/templates/llms.txt

diff --git a/website/build.py b/website/build.py
index 8fb5f38..f9e3aa5 100644
--- a/website/build.py
+++ b/website/build.py
@@ -14,6 +14,8 @@ from jinja2 import Environment, FileSystemLoader
 from readme_parser import ParsedGroup, ParsedSection, parse_readme, parse_sponsors

 GITHUB_REPO_URL_RE = re.compile(r"^https?://github\.com/([^/]+/[^/]+?)(?:\.git)?/?$")
+MARKDOWN_LINK_RE = re.compile(r"\[[^\]]+\]\(([^)\s]+)\)")
+BULLET_LINE_RE = re.compile(r"^\s*-\s")
 SITE_URL = "https://awesome-python.com/"
 SITEMAP_URL = f"{SITE_URL}sitemap.xml"
 SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
@@ -104,6 +106,72 @@ def top_level_heading_text(line: str) -> str | None:
     return stripped.removeprefix("#").strip().strip("#").strip().strip("*").strip()


+LLMS_CATEGORIES_PLACEHOLDER = "{{ categories_md }}"
+
+
+def extract_categories_body(markdown: str) -> str:
+    """Return content under the `# Categories` heading, excluding the heading line itself."""
+    lines = markdown.splitlines(keepends=True)
+    start_idx = None
+    end_idx = len(lines)
+    for i, line in enumerate(lines):
+        heading = top_level_heading_text(line)
+        if heading is None:
+            continue
+        if start_idx is None and heading.lower() == "categories":
+            start_idx = i + 1
+            while start_idx < len(lines) and lines[start_idx].strip() == "":
+                start_idx += 1
+        elif start_idx is not None and i >= start_idx:
+            end_idx = i
+            break
+    if start_idx is None:
+        return ""
+    return "".join(lines[start_idx:end_idx]).rstrip() + "\n"
+
+
+def build_llms_txt(template_text: str, readme_text: str, stars_data: dict[str, dict]) -> str:
+    """Render the llms.txt template by injecting the README's Categories body, then annotate stars."""
+    body = extract_categories_body(readme_text).rstrip()
+    rendered = template_text.replace(LLMS_CATEGORIES_PLACEHOLDER, body)
+    return annotate_entries_with_stars(rendered, stars_data, format_stars=str)
+
+
+def annotate_entries_with_stars(
+    markdown: str,
+    stars_data: dict[str, dict],
+    *,
+    format_stars=None,
+) -> str:
+    """Append the star count to bullet entry lines whose first GitHub link has known star data.
+
+    `format_stars` controls the parenthesized text. Defaults to "{N} GitHub stars".
+    Pass `str` for a bare number.
+    """
+    if format_stars is None:
+        format_stars = lambda n: f"{n} GitHub stars"  # noqa: E731 lambda-assignment
+    lines = markdown.splitlines(keepends=True)
+    out: list[str] = []
+    for line in lines:
+        if not BULLET_LINE_RE.match(line):
+            out.append(line)
+            continue
+        annotated = line
+        for match in MARKDOWN_LINK_RE.finditer(line):
+            repo_key = extract_github_repo(match.group(1))
+            if not repo_key:
+                continue
+            entry = stars_data.get(repo_key)
+            if not entry or "stars" not in entry:
+                continue
+            stripped = line.rstrip("\n")
+            ending = line[len(stripped):]
+            annotated = f"{stripped} ({format_stars(entry['stars'])}){ending}"
+            break
+        out.append(annotated)
+    return "".join(out)
+
+
 def remove_sponsors_section(markdown: str) -> str:
     lines = markdown.splitlines(keepends=True)
     start_idx = None
@@ -243,11 +311,15 @@ def build(repo_root: Path) -> None:
     if static_src.exists():
         shutil.copytree(static_src, static_dst, dirs_exist_ok=True)

-    markdown_index = remove_sponsors_section(readme_text)
+    markdown_index = annotate_entries_with_stars(
+        remove_sponsors_section(readme_text), stars_data
+    )
+    llms_template = (website / "templates" / "llms.txt").read_text(encoding="utf-8")
+    llms_txt = build_llms_txt(llms_template, readme_text, stars_data)
     (site_dir / "robots.txt").write_text(build_robots_txt(), encoding="utf-8")
     write_sitemap_xml(site_dir / "sitemap.xml", [(SITE_URL, build_date.date().isoformat())])
     (site_dir / "index.md").write_text(markdown_index, encoding="utf-8")
-    (site_dir / "llms.txt").write_text(markdown_index, encoding="utf-8")
+    (site_dir / "llms.txt").write_text(llms_txt, encoding="utf-8")

     print(f"Built single page with {len(parsed_groups)} groups, {len(categories)} categories")
     print(f"Total entries: {total_entries}")
diff --git a/website/templates/llms.txt b/website/templates/llms.txt
new file mode 100644
index 0000000..1db05c3
--- /dev/null
+++ b/website/templates/llms.txt
@@ -0,0 +1,9 @@
+# Awesome Python
+
+An opinionated guide to the best Python frameworks, libraries, tools, and resources.
+
+Use this curated list when you need to find a high-quality Python library or tool for tasks such as web development, data science, machine learning, AI agents, automation, testing, or DevOps. The trailing number on each entry is its star count on GitHub.
+
+# Categories
+
+{{ categories_md }}
diff --git a/website/tests/test_build.py b/website/tests/test_build.py
index 1feab77..32b0191 100644
--- a/website/tests/test_build.py
+++ b/website/tests/test_build.py
@@ -9,6 +9,7 @@ from html.parser import HTMLParser
 from pathlib import Path

 from build import (
+    annotate_entries_with_stars,
     build,
     detect_source_type,
     extract_entries,
@@ -108,6 +109,16 @@ class TestBuild:
             "{% endblock %}",
             encoding="utf-8",
         )
+        (tpl_dir / "llms.txt").write_text(
+            "# Awesome Python\n"
+            "\n"
+            "Use this list to find Python tools.\n"
+            "\n"
+            "# Categories\n"
+            "\n"
+            "{{ categories_md }}\n",
+            encoding="utf-8",
+        )

     def _copy_real_templates(self, tmp_path):
         real_tpl = Path(__file__).parent / ".." / "templates"
@@ -223,6 +234,7 @@ class TestBuild:
             ## Widgets

             - [w1](https://example.com) - A widget.
+            - [w2](https://github.com/owner/w2) - A starred widget.

             # Contributing

@@ -231,6 +243,13 @@ class TestBuild:
         (tmp_path / "README.md").write_text(readme, encoding="utf-8")
         self._copy_real_templates(tmp_path)

+        data_dir = tmp_path / "website" / "data"
+        data_dir.mkdir(parents=True)
+        stars = {
+            "owner/w2": {"stars": 42, "owner": "owner", "fetched_at": "2026-01-01T00:00:00+00:00"},
+        }
+        (data_dir / "github_stars.json").write_text(json.dumps(stars), encoding="utf-8")
+
         build(tmp_path)

         site = tmp_path / "website" / "output"
@@ -239,13 +258,23 @@ class TestBuild:
         llms_txt = (site / "llms.txt").read_text(encoding="utf-8")

         assert '' in index_html
-        assert index_md == llms_txt
         assert index_md.startswith("# Awesome Python\n\nIntro.\n\n# Categories")
         assert "# **Sponsors**" not in index_md
         assert "Sponsor" not in index_md
         assert "SPONSORSHIP.md" not in index_md
         assert "## Widgets" in index_md
         assert "- [w1](https://example.com) - A widget." in index_md
+        assert "- [w2](https://github.com/owner/w2) - A starred widget. (42 GitHub stars)" in index_md
+
+        assert llms_txt.startswith("# Awesome Python\n")
+        assert "# Categories" in llms_txt
+        assert "Use this curated list" in llms_txt
+        assert "## Widgets" in llms_txt
+        assert "- [w1](https://example.com) - A widget." in llms_txt
+        assert "- [w2](https://github.com/owner/w2) - A starred widget. (42)" in llms_txt
+        assert "{{ categories_md }}" not in llms_txt
+        assert "# Contributing" not in llms_txt
+        assert "Help!" not in llms_txt

     def test_build_cleans_stale_output(self, tmp_path):
         readme = textwrap.dedent("""\
@@ -604,3 +633,59 @@ class TestExtractEntries:
         categories = [c for g in groups for c in g["categories"]]
         entries = extract_entries(categories, groups)
         assert entries[0]["source_type"] == "Built-in"
+
+
+# ---------------------------------------------------------------------------
+# annotate_entries_with_stars
+# ---------------------------------------------------------------------------
+
+
+class TestAnnotateEntriesWithStars:
+    def test_appends_star_count_to_bullet(self):
+        markdown = "- [foo](https://github.com/owner/foo) - A foo.\n"
+        stars = {"owner/foo": {"stars": 123, "owner": "owner"}}
+        assert annotate_entries_with_stars(markdown, stars) == (
+            "- [foo](https://github.com/owner/foo) - A foo. (123 GitHub stars)\n"
+        )
+
+    def test_uses_first_github_link(self):
+        markdown = (
+            "- [foo](https://github.com/owner/foo) - A foo. "
+            "Also [bar](https://github.com/owner/bar).\n"
+        )
+        stars = {
+            "owner/foo": {"stars": 10, "owner": "owner"},
+            "owner/bar": {"stars": 99, "owner": "owner"},
+        }
+        assert annotate_entries_with_stars(markdown, stars) == (
+            "- [foo](https://github.com/owner/foo) - A foo. "
+            "Also [bar](https://github.com/owner/bar). (10 GitHub stars)\n"
+        )
+
+    def test_skips_entries_without_star_data(self):
+        markdown = "- [foo](https://github.com/owner/foo) - A foo.\n"
+        assert annotate_entries_with_stars(markdown, {}) == markdown
+
+    def test_skips_non_github_links(self):
+        markdown = "- [foo](https://example.com) - A foo.\n"
+        stars = {"owner/foo": {"stars": 1, "owner": "owner"}}
+        assert annotate_entries_with_stars(markdown, stars) == markdown
+
+    def test_skips_non_bullet_lines(self):
+        markdown = "See [foo](https://github.com/owner/foo) for details.\n"
+        stars = {"owner/foo": {"stars": 1, "owner": "owner"}}
+        assert annotate_entries_with_stars(markdown, stars) == markdown
+
+    def test_handles_indented_bullets(self):
+        markdown = " - [foo](https://github.com/owner/foo)\n"
+        stars = {"owner/foo": {"stars": 7, "owner": "owner"}}
+        assert annotate_entries_with_stars(markdown, stars) == (
+            " - [foo](https://github.com/owner/foo) (7 GitHub stars)\n"
+        )
+
+    def test_preserves_lines_without_trailing_newline(self):
+        markdown = "- [foo](https://github.com/owner/foo) - A foo."
+        stars = {"owner/foo": {"stars": 5, "owner": "owner"}}
+        assert annotate_entries_with_stars(markdown, stars) == (
+            "- [foo](https://github.com/owner/foo) - A foo. (5 GitHub stars)"
+        )

From e11afd1730cb8fe275ada1a02fef047de947ccff Mon Sep 17 00:00:00 2001
From: Vinta Chen
Date: Sat, 2 May 2026 23:31:08 +0800
Subject: [PATCH 13/73] feat(website): generate static category pages

---
 website/build.py | 34 +++++-
 website/static/main.js | 2 +
 website/static/style.css | 74 ++++++++++++
 website/templates/base.html | 9 +-
 website/templates/category.html | 195 ++++++++++++++++++++++++++++++++
 website/templates/index.html | 7 +-
 website/tests/test_build.py | 90 ++++++++++++++-
 7 files changed, 399 insertions(+), 12 deletions(-)
 create mode 100644 website/templates/category.html

diff --git a/website/build.py b/website/build.py
index f9e3aa5..af644bf 100644
--- a/website/build.py
+++ b/website/build.py
@@ -84,6 +84,14 @@ def build_robots_txt() -> str:
     )

+
+def category_path(category: ParsedSection) -> str:
+    return f"/categories/{category['slug']}/"
+
+
+def category_public_url(category: ParsedSection) -> str:
+    return f"{SITE_URL}categories/{category['slug']}/"
+

 def write_sitemap_xml(path: Path, urls: Sequence[tuple[str, str]]) -> None:
     ET.register_namespace("", SITEMAP_NS)
     urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
@@ -278,6 +286,7 @@ def build(repo_root: Path) -> None:
             entry["last_commit_at"] = sd.get("last_commit_at", "")

     entries = sort_entries(entries)
+    category_urls = {cat["name"]: category_path(cat) for cat in categories}

     env = Environment(
         loader=FileSystemLoader(website / "templates"),
@@ -302,10 +311,27 @@ def build(repo_root: Path) -> None:
             repo_stars=repo_stars,
             build_date=build_date.strftime("%B %d, %Y"),
             sponsors=sponsors,
+            category_urls=category_urls,
         ),
         encoding="utf-8",
     )
+
+    tpl_category = env.get_template("category.html")
+    categories_dir = site_dir / "categories"
+    for category in categories:
+        category_entries = [entry for entry in entries if category["name"] in entry["categories"]]
+        page_dir = categories_dir / category["slug"]
+        page_dir.mkdir(parents=True, exist_ok=True)
+        (page_dir /
"index.html").write_text( + tpl_category.render( + category=category, + category_url=category_public_url(category), + entries=category_entries, + total_categories=len(categories), + ), + encoding="utf-8", + ) + static_src = website / "static" static_dst = site_dir / "static" if static_src.exists(): @@ -317,11 +343,15 @@ def build(repo_root: Path) -> None: llms_template = (website / "templates" / "llms.txt").read_text(encoding="utf-8") llms_txt = build_llms_txt(llms_template, readme_text, stars_data) (site_dir / "robots.txt").write_text(build_robots_txt(), encoding="utf-8") - write_sitemap_xml(site_dir / "sitemap.xml", [(SITE_URL, build_date.date().isoformat())]) + sitemap_date = build_date.date().isoformat() + sitemap_urls = [(SITE_URL, sitemap_date)] + [ + (category_public_url(category), sitemap_date) for category in categories + ] + write_sitemap_xml(site_dir / "sitemap.xml", sitemap_urls) (site_dir / "index.md").write_text(markdown_index, encoding="utf-8") (site_dir / "llms.txt").write_text(llms_txt, encoding="utf-8") - print(f"Built single page with {len(parsed_groups)} groups, {len(categories)} categories") + print(f"Built site with {len(parsed_groups)} groups, {len(categories)} categories") print(f"Total entries: {total_entries}") print(f"Output: {site_dir}") diff --git a/website/static/main.js b/website/static/main.js index 7353ff2..f875f8b 100644 --- a/website/static/main.js +++ b/website/static/main.js @@ -202,6 +202,8 @@ function getSortValue(row, col) { } function sortRows() { + if (!tbody) return; + const arr = Array.prototype.slice.call(rows); const col = activeSort.col; const order = activeSort.order; diff --git a/website/static/style.css b/website/static/style.css index ec395e9..2adeca3 100644 --- a/website/static/style.css +++ b/website/static/style.css @@ -376,18 +376,92 @@ kbd { } .hero-action:focus-visible, +.hero-brand-mini:focus-visible, .hero-topbar-link:focus-visible, .search:focus-visible, .filter-clear:focus-visible, .tag:focus-visible, 
 .back-to-top:focus-visible,
 .no-results-clear:focus-visible,
+.category-table a:focus-visible,
 .footer a:focus-visible,
 .sort-btn:focus-visible {
   outline: 2px solid var(--accent);
   outline-offset: 3px;
 }
 
+.category-hero {
+  position: relative;
+  overflow: clip;
+  background: linear-gradient(140deg, var(--hero-bg-start) 0%, var(--hero-bg-mid) 58%, var(--hero-bg-end) 100%);
+  color: var(--hero-text);
+}
+
+.category-hero-shell {
+  position: relative;
+  z-index: 1;
+  width: min(100%, calc(var(--shell-max) + (var(--shell-pad) * 2)));
+  margin: 0 auto;
+  padding: 1.25rem var(--shell-pad) clamp(3.75rem, 8vw, 6.75rem);
+  display: grid;
+  gap: clamp(3rem, 8vw, 5.5rem);
+}
+
+.category-hero h1 {
+  font-family: var(--font-display);
+  font-size: clamp(3.6rem, 9vw, 7rem);
+  line-height: 0.9;
+  font-weight: 600;
+  text-wrap: balance;
+}
+
+.category-subtitle {
+  max-width: 68ch;
+  margin-top: 1.1rem;
+  color: var(--hero-muted);
+  font-size: clamp(1rem, 1.8vw, 1.18rem);
+  text-wrap: pretty;
+}
+
+.category-results {
+  padding-top: clamp(2.5rem, 5vw, 3.75rem);
+}
+
+.category-table .col-name {
+  width: min(42rem, 48vw);
+  white-space: normal;
+}
+
+.category-table .col-name > a {
+  display: inline-block;
+}
+
+.category-row-desc {
+  display: block;
+  max-width: 68ch;
+  margin-top: 0.32rem;
+  color: var(--ink-soft);
+  font-size: var(--text-sm);
+  font-weight: 500;
+  line-height: 1.55;
+  text-wrap: pretty;
+}
+
+.category-row-desc a {
+  color: var(--accent-deep);
+  text-decoration: underline;
+  text-decoration-color: var(--accent-underline);
+  text-underline-offset: 0.18em;
+}
+
+.category-row-desc a:hover {
+  color: var(--accent);
+}
+
+.category-table .expand-content {
+  padding-block: 0.25rem 0.15rem;
+}
+
 .sponsor-band {
   padding-block: clamp(2.5rem, 5.5vw, 4rem);
   background:
diff --git a/website/templates/base.html b/website/templates/base.html
index af11209..22b56e9 100644
--- a/website/templates/base.html
+++ b/website/templates/base.html
@@ -3,21 +3,24 @@
     {% set default_meta_title = "Awesome Python" %}
     {% set default_meta_description = "An opinionated guide to the best Python frameworks, libraries, and tools. Explore " ~ (entries | length) ~ " curated projects across " ~ total_categories ~ " categories, from AI and agents to data science and web development." %}
-    {% set canonical_url = "https://awesome-python.com/" %}
+    {% set default_canonical_url = "https://awesome-python.com/" %}
     {% set social_image_url = "https://awesome-python.com/static/og-image.png" %}
     {% set meta_title %}{% block title %}{{ default_meta_title }}{% endblock %}{% endset %}
     {% set meta_description %}{% block description %}{{ default_meta_description }}{% endblock %}{% endset %}
+    {% set canonical_url %}{% block canonical_url %}{{ default_canonical_url }}{% endblock %}{% endset %}
     {{ meta_title | trim }}
-
+
+    {% block alternate_links %}
+    {% endblock %}
-
+
diff --git a/website/templates/category.html b/website/templates/category.html
new file mode 100644
index 0000000..4415207
--- /dev/null
+++ b/website/templates/category.html
@@ -0,0 +1,195 @@
+{% extends "base.html" %}
+{% block title %}{{ category.name }} Python Libraries | Awesome Python{% endblock %}
+{% block description %}Explore {{ entries | length }} curated Python projects in {{ category.name }}. {% if category.description %}{{ category.description }}{% else %}Part of the Awesome Python catalog.{% endif %}{% endblock %}
+{% block canonical_url %}{{ category_url }}{% endblock %}
+{% block alternate_links %}{% endblock %}
+{% block header %}
+
+ + + +
+ + +
+

{{ category.name }}

+ {% if category.description %} +

{{ category.description }}

+ {% endif %} +
+
+
+{% endblock %} +{% block content %} +
+
+
+

Projects in {{ category.name }}

+
+

+ Sorted by GitHub stars when available. Click any row for details. +

+
+ +

{{ category.name }} results

+
+ + + + + + + + + + + + + {% for entry in entries %} + + + + + + + + + + + + + + {% endfor %} + +
Row number + + + + + + TagsDetails
+
+ {% if entry.also_see %} +
+ Also see: {% for see in entry.also_see %}{{ see.name }}{% if not loop.last %}, {% endif %}{% endfor %} +
+ {% endif %} +
+ {% if entry.owner %}{{ entry.owner }}/{% endif %}{{ entry.url | replace("https://", "") }} + {% if entry.last_commit_at %}/{% endif %} +
+
+
+
+
+ +
+
+ +

Know a project that belongs here?

+

Tell us what it does and why it stands out.

+ +
+
+{% endblock %} diff --git a/website/templates/index.html b/website/templates/index.html index 53e968d..e2853d0 100644 --- a/website/templates/index.html +++ b/website/templates/index.html @@ -215,7 +215,12 @@ {{ subcat.name }} {% endfor %} {% for cat in entry.categories %} - + {{ cat }} {% endfor %}