feat: Add support for Excel and JSON file uploads (fixes #71) #72
Open
ArshVermaGit wants to merge 1 commit into
Refactor data_loader.py to detect file type by extension and dispatch to the appropriate pandas reader (CSV, Excel, JSON). Update the file uploader in app.py to accept all supported formats and add openpyxl dependency for Excel reading.
ArshVermaGit (Author) commented May 22, 2026
ArshVermaGit left a comment:
All changes are tested and ready. Please review and merge when you get a chance. Thanks!
Author
Hi @Payal-Dhokane! Issue #71 has been resolved. Please review the PR and merge it under GSSoC. Thanks!
Summary
Resolves #71
This PR adds support for Excel (`.xlsx`/`.xls`) and JSON file uploads to DataWhisper, which was previously limited to CSV files only.
Problem
The data loader (`data_loader.py`) hardcoded `pd.read_csv()` as the only file reader, and the Streamlit file uploader in `app.py` restricted uploads to `type=["csv"]`. This blocked users from uploading Excel or JSON files, two of the most common data formats in enterprise and API-driven workflows.
Changes Made
`src/data_loader.py`: Refactored
- `_get_file_extension()` to detect file type from both `UploadedFile` objects and string paths
- `_load_csv()` (preserves multi-encoding fallback)
- `_load_excel()`: reads `.xlsx`/`.xls` via `pd.read_excel()` with the `openpyxl` engine
- `_load_json()`: reads standard JSON with automatic fallback to JSON Lines (`lines=True`)
- `_LOADERS` dispatch dict for clean extension → loader routing
- `SUPPORTED_EXTENSIONS` as single source of truth for accepted file types

`app.py`: Updated
- `SUPPORTED_EXTENSIONS` from `data_loader`
- `st.file_uploader()` to accept `csv`, `xlsx`, `xls`, `json`

`requirements.txt`: Updated
- `openpyxl>=3.1.0` (required by pandas for Excel file reading)

What's NOT Changed
No changes to `eda.py`, `llm_insights.py`, `report_generator.py`, `chat.py`, or `ui_components.py`. These modules operate on `pd.DataFrame` objects and are format-agnostic.
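The JSON Lines fallback listed under Changes Made can be illustrated with a small standard-library sketch. The PR reports doing this through `pd.read_json()` with `lines=True`. The version below swaps in the `json` module to show the same try-then-fall-back idea without pandas, and `parse_json_records` is a hypothetical name, not a function from the PR.

```python
import json


def parse_json_records(text):
    """Parse `text` as one standard JSON document, else as JSON Lines."""
    try:
        # First attempt: the whole payload is a single JSON value.
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: one JSON value per non-empty line (JSON Lines).
        # A malformed line still raises, so truly invalid input is not hidden.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
```

For example, `parse_json_records('[{"a": 1}]')` takes the first branch, while a payload with one object per line triggers the fallback and yields a list of records.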
Testing
- `.csv`, `.xlsx`, `.xls`, `.json`, and unsupported types
- `openpyxl` package