Skip to content

Conversation

@robertatakenaka
Copy link
Member

Resumo

Este PR implementa melhorias significativas no processo de carregamento de Issues a partir do ArticleMeta, adiciona uma nova task para atualização em lote, e corrige problemas de robustez nos modelos.

Mudanças

Core

  • BaseLegacyRecord.get: Melhora tratamento de registros duplicados, removendo órfãos e retornando o mais recente em caso de MultipleObjectsReturned

Issue

Models

  • Issue.short_identification: Gera issue_folder automaticamente se ausente
  • Issue.get: Trata duplicatas e auto-preenche campos ausentes (issue_folder, issue_pid_suffix, order)
  • TableOfContents: Adiciona método get_items_by_title para busca padronizada por título

ArticleMeta Loader

  • Refatora complete_am_issue e get_issue_data_from_am_issue com melhor tratamento de erros e detalhes para debug
  • Extrai lógica de carregamento para funções dedicadas e reutilizáveis:
    • _extract_field: extração genérica com prioridade de fontes
    • load_issue_sections: carrega sections com tratamento de erros
    • load_issue_titles: carrega títulos com tratamento de erros
    • load_bibliographic_strips: carrega strips com tratamento de erros

Issue Utils

  • Corrige tratamento de valores None em sections_data e bibliographic_strip

Tasks

  • Nova task task_update_issues_from_amissue: Atualização em lote de Issues com suporte a múltiplos filtros (collection, journal, status, datas, volume, número, etc.)

Article

Sources (xmlsps)

  • Melhora get_or_create_toc_sections para carregar sections do AMIssue quando necessário
  • Usa TableOfContents.get_items_by_title para busca padronizada

Dependencies

  • Limita django-otp < 1.2.0 para compatibilidade com wagtail_2fa.middleware.VerifyUserPermissionsMiddleware

Commits

  1. Limita versão do django-otp para compatibilidade com wagtail_2fa
  2. Melhora tratamento de registros duplicados em BaseLegacyRecord.get
  3. Corrige tratamento de valores None em extract_data_from_harvested_data
  4. Melhora robustez dos modelos Issue e TableOfContents
  5. Refatora loader de issues do ArticleMeta
  6. Adiciona task para atualização em lote de Issues a partir de AMIssue
  7. Melhora busca de seções TOC em get_or_create_toc_sections

Testes

  • Testar carregamento de issues via ArticleMeta
  • Testar task_update_issues_from_amissue com diferentes filtros
  • Verificar busca de seções TOC em artigos

- Restringe django-otp < 1.2.0
- Resolve problema com VerifyUserPermissionsMiddleware
- Remove registros órfãos (sem url e sem data) antes da consulta
- Trata MultipleObjectsReturned retornando o registro mais recente
- Adiciona fallback para lista vazia em sections_data
- Adiciona fallback para lista vazia em bibliographic_strip
- Restringe django-otp < 1.2.0
- Resolve problema com VerifyUserPermissionsMiddleware
complete_am_issue:
- Adiciona validações e detalhes para debug

get_issue_data_from_am_issue:
- Melhora tratamento de am_issue.data ausente
- Extrai journal_pid do pid ao invés de am_data
- Adiciona detalhes de debug nas exceções

create_issue_from_am_issue:
- Extrai lógica de carregamento para funções dedicadas

Novas funções:
- _extract_field: extração genérica com prioridade de fontes
- load_issue_sections: carrega sections com tratamento de erros
- load_issue_titles: carrega títulos com tratamento de erros
- load_bibliographic_strips: carrega strips com tratamento de erros
task_update_issues_from_amissue:
- Suporta filtros por collection, journal_pid, status, datas
- Suporta filtros por year, volume, number, supplement
- Opção force_update para atualizar Issues existentes
- Opção only_without_new_record para processar apenas pendentes
- Remove AMIssue órfãos antes do processamento
- Retorna estatísticas detalhadas de processamento
- Tratamento individual de erros por registro
Copilot AI review requested due to automatic review settings January 25, 2026 10:21
- Carrega sections do AMIssue se table_of_contents estiver vazio
- Usa TableOfContents.get_items_by_title para busca padronizada
- Corrige import de load_issue_sections para issue_utils
- Move verificação de exists() antes do loop de append
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the robustness of Issue loading and synchronization from ArticleMeta, adds a batch task to update Issues from AMIssue, and tightens handling of duplicate/legacy records.

Changes:

  • Adds stricter legacy record retrieval in BaseLegacyRecord.get and Issue.get, including duplicate resolution and auto-filling of computed fields (issue_folder, issue_pid_suffix, order).
  • Refactors the ArticleMeta Issue loader into reusable helpers (get_issue_data_from_am_issue, load_issue_sections, load_issue_titles, load_bibliographic_strips) and introduces task_update_issues_from_amissue for bulk updates.
  • Enhances TOC section lookup and ArticleMeta-backed section loading in XML SPS article import, and adds a compatibility pin for django-otp to work with wagtail_2fa.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
requirements/base.txt Adds django-otp<1.2.0 constraint to maintain compatibility with wagtail_2fa.
core/models.py Updates BaseLegacyRecord.get to clean orphan records and resolve duplicates by returning the most recently updated record.
issue/models.py Ensures Issue.short_identification and Issue.get auto-generate missing computed fields and adds TableOfContents.get_items_by_title for consistent TOC lookups.
issue/articlemeta/issue_utils.py Makes extract_data_from_harvested_data robust to missing sections_data / bibliographic_strip by defaulting to empty collections.
issue/articlemeta/loader.py Strengthens complete_am_issue and get_issue_data_from_am_issue error reporting and factors out reusable load_issue_* helpers for sections, titles, and bibliographic strips.
issue/tasks.py Introduces task_update_issues_from_amissue to bulk-create/update Issue records from AMIssue with rich filtering and per-item error tracking.
article/sources/xmlsps.py Integrates AMIssue-backed TOC sections into get_or_create_toc_sections and switches to TableOfContents.get_items_by_title for normalized TOC section resolution.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +32 to +33
from issue.models import Issue, TableOfContents, AMIssue
from issue.articlemeta.loader import load_issue_sections
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_issue_sections is imported from issue.articlemeta.issue_utils, but the function is actually defined in issue.articlemeta.loader; as written this import will raise ImportError at module import time and break any code path that loads article.sources.xmlsps.

To fix this, adjust the import to pull load_issue_sections from the correct module (issue.articlemeta.loader).

Copilot uses AI. Check for mistakes.
Comment on lines +844 to +845
if issue.table_of_contents is None:
raise cls.DoesNotExist("TableOfContents.get_items_by_title requires valid issue")
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if issue.table_of_contents is None: will never trigger because table_of_contents is a reverse related manager on Issue and is never None, even when there are no TOC records; this makes the branch (and its DoesNotExist message about a "valid issue") effectively dead code and may confuse future maintainers.

If you want to guard on the presence of TOC entries, consider checking for emptiness (e.g. if not issue.table_of_contents.exists():) or simply dropping this check and relying on the query/filter itself.

Suggested change
if issue.table_of_contents is None:
raise cls.DoesNotExist("TableOfContents.get_items_by_title requires valid issue")

Copilot uses AI. Check for mistakes.
Comment on lines +1025 to +1026
except Exception:
pass
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
except Exception:
pass
except Exception as exc:
logging.getLogger(__name__).warning(
"Failed to delete legacy records with null url and data in %s.get: %s",
cls.__name__,
exc,
)

Copilot uses AI. Check for mistakes.
# Limpar AMIssue órfãos (sem data e sem URL)
AMIssue.objects.filter(data__isnull=True, url__isnull=True).delete()
except Exception:
pass
Copy link

Copilot AI Jan 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Suggested change
pass
logging.exception("Failed to delete orphan AMIssue records (data__isnull=True, url__isnull=True)")

Copilot uses AI. Check for mistakes.
@robertatakenaka robertatakenaka merged commit 4a48316 into scieloorg:main Jan 25, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant