Kiwix catalog shows articleCount almost 3X higher than the actual number of ZIM (e.g. Wikipedia) articles #767

@holta

  1. Any idea why https://library.kiwix.org/catalog/v2/entries?count=-1 has recently begun reporting incorrect articleCount values?

  2. Here is one of several examples from the URL above. It reports 18,579,023 articles for wikipedia_en_all, which has roughly 7 million articles (the entry shown is the mini flavour, dated 2025-06-17):

      <entry>
        <id>urn:uuid:1ee73243-001b-33e7-1fb7-466f1b4ffec4</id>
        <title>Wikipedia</title>
        <updated>2025-06-17T00:00:00Z</updated>
        <summary>offline version of Wikipedia in English</summary>
        <language>eng</language>
        <name>wikipedia_en_all</name>
        <flavour>mini</flavour>
        <category>wikipedia</category>
        <tags>wikipedia;_category:wikipedia;_pictures:no;_videos:no;_details:no;_ftindex:yes</tags>
        <articleCount>18579023</articleCount>
        <mediaCount>33612</mediaCount>
        <link rel="http://opds-spec.org/image/thumbnail"
              href="/catalog/v2/illustration/1ee73243-001b-33e7-1fb7-466f1b4ffec4/?size=48"
              type="image/png;width=48;height=48;scale=1"/>
        <link type="text/html" href="/content/wikipedia_en_all_mini_2025-06" />
        <author>
          <name>Wikipedia</name>
        </author>
        <publisher>
          <name>openZIM</name>
        </publisher>
        <dc:issued>2025-06-17T00:00:00Z</dc:issued>
        <link rel="http://opds-spec.org/acquisition/open-access" type="application/x-zim" href="https://download.kiwix.org/zim/wikipedia/wikipedia_en_all_mini_2025-06.zim.meta4" length="13823501312" />
      </entry>

    In contrast, browsing wikipedia_en_all_maxi_2025-08.zim shows a more meaningful (and more accurate) number:

    [Image: screenshot of the article count shown when browsing wikipedia_en_all_maxi_2025-08.zim]
  3. What would be the best way(s) to figure this out?

  4. Should this problem be reported to some other repo?
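
One possible starting point for question 3 is to pull `name`, `flavour` and `articleCount` out of every `<entry>` in the catalog feed, so the reported values can be compared side by side with counts obtained elsewhere (for example from the ZIM file itself). The sketch below is only an illustration: the helper names `local_name` and `article_counts` are made up for this example, the sample feed is a trimmed copy of the entry above, and tags are matched by local name so the code does not depend on which XML namespaces the real feed declares.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def article_counts(feed_xml):
    """Return a list of (name, flavour, articleCount) tuples, one per <entry>.

    Hypothetical helper for this issue, not part of any Kiwix tool.
    """
    root = ET.fromstring(feed_xml)
    rows = []
    for element in root.iter():
        if local_name(element.tag) != "entry":
            continue
        # Only direct children are read, so <author><name> does not
        # overwrite the entry's own <name>.
        fields = {local_name(child.tag): (child.text or "").strip()
                  for child in element}
        rows.append((
            fields.get("name", ""),
            fields.get("flavour", ""),
            int(fields.get("articleCount") or 0),
        ))
    return rows


# Trimmed copy of the entry quoted in this issue (assumed Atom namespace).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Wikipedia</title>
    <name>wikipedia_en_all</name>
    <flavour>mini</flavour>
    <articleCount>18579023</articleCount>
    <mediaCount>33612</mediaCount>
    <author><name>Wikipedia</name></author>
  </entry>
</feed>"""

for name, flavour, count in article_counts(SAMPLE):
    print(f"{name} ({flavour}): articleCount={count:,}")
```

To run this against the live catalog, the response body of https://library.kiwix.org/catalog/v2/entries?count=-1 (fetched with `urllib.request.urlopen`, for instance) could be passed to `article_counts` in place of `SAMPLE`. Sorting the rows by count may make it easier to see whether the inflation affects every entry or only certain flavours or dates.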
