MLG - Visual Machine Learning arxiv Graph and Textual explorer

MLG is a visual representation of ML researchers and papers from arXiv from the last year (for now). Each node in the graph is an author, and each edge represents co-authorship of a paper.
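
As a rough illustration of the graph structure described above, here is a minimal sketch that turns a list of papers into author nodes and co-authorship edges. The `build_coauthor_graph` function, the `papers` sample data, and the edge-weight convention (number of shared papers) are assumptions made for this example; they are not taken from the MLG code.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """Return (nodes, edges) for a co-authorship graph.

    nodes: sorted list of author names.
    edges: dict mapping a sorted (author_a, author_b) pair to the number
           of papers the two authors share.
    """
    authors = set()
    edge_weights = defaultdict(int)
    for paper in papers:
        # Deduplicate and sort so every pair has one canonical ordering.
        names = sorted(set(paper["authors"]))
        authors.update(names)
        for a, b in combinations(names, 2):
            edge_weights[(a, b)] += 1
    return sorted(authors), dict(edge_weights)


# Hypothetical sample data, not real arXiv records.
papers = [
    {"title": "Paper A", "authors": ["Alice", "Bob", "Carol"]},
    {"title": "Paper B", "authors": ["Alice", "Bob"]},
    {"title": "Paper C", "authors": ["Dave"]},
]

nodes, edges = build_coauthor_graph(papers)
```

An author with no co-authors (like "Dave" here) still becomes a node but has no edges.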

MLG allows you to:

Search for papers or authors.
Filter by topics (NLP, Vision, etc).
Focus on a specific author and gradually explore their neighbors.

The backend is based on arxiv-sanity but with a lot of modifications - all arXiv data is stored in MongoDB, the Twitter daemon was rebuilt, etc.

There are two large parts of the code:

/ - arXiv text explorer
/network - arXiv visual graph explorer

Dependencies

$ virtualenv env                # optional: use virtualenv
$ source env/bin/activate       # optional: use virtualenv
$ pip install -r requirements.txt

There is still some legacy code from arxiv-sanity, therefore some of the

Processing pipeline

Install and start MongoDB
Optional - Run fetch_papers.py to collect all papers from arXiv.
Create twitter.txt with your Twitter API credentials (the consumer key and the consumer secret, on separate lines).
Run the flask server with serve.py. Visit localhost:5000 and enjoy sane viewing of papers!
Background tasks will fetch new papers and search for Twitter mentions.
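
The twitter.txt file from the steps above could be read with something like the following sketch. The `parse_credentials` and `load_credentials` helpers are hypothetical, and the order (key on the first line, secret on the second) is an assumption based on the wording of the step; check twitter_daemon.py for the real parsing.

```python
def parse_credentials(text):
    """Split credential text into (consumer_key, consumer_secret).

    Assumes the key is on the first non-empty line and the secret on
    the second; blank lines and surrounding whitespace are ignored.
    """
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected the consumer key and secret on separate lines")
    return lines[0], lines[1]


def load_credentials(path="twitter.txt"):
    """Read the credentials file and return (consumer_key, consumer_secret)."""
    with open(path, encoding="utf-8") as f:
        return parse_credentials(f.read())


# Placeholder values, not real credentials.
sample_key, sample_secret = parse_credentials("my-consumer-key\nmy-consumer-secret\n")
```

Keep twitter.txt out of version control, since it holds secrets.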

Generating the network graph

After fetching papers from arXiv, you can build the network graph by running the notebook graph_generator.ipynb. It will overwrite static/network_data.json.

Note: Calculating the physics of the network (the node positions) is very slow. The current hack is to run it once (by changing the physics settings in network.js) and store the calculated positions. I tried using NetworkX to calculate the positions, but the results weren't pleasing.
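
To make the "run once and store the positions" idea concrete, here is a minimal sketch of merging precomputed coordinates back into the node list so the layout does not need to be recalculated. The `freeze_positions` function and the `nodes`/`edges`/`id`/`x`/`y` field names are assumptions for illustration only; the real schema of static/network_data.json is not described here and may differ.

```python
import json


def freeze_positions(graph, positions):
    """Return a copy of graph whose nodes carry stored x/y coordinates.

    graph:     {"nodes": [{"id": ...}, ...], "edges": [...]} (hypothetical schema)
    positions: dict mapping node id to an (x, y) pair from an earlier layout run.
    Nodes without a stored position are copied unchanged.
    """
    frozen_nodes = []
    for node in graph["nodes"]:
        node = dict(node)  # shallow copy so the input graph is not mutated
        if node["id"] in positions:
            node["x"], node["y"] = positions[node["id"]]
        frozen_nodes.append(node)
    return {"nodes": frozen_nodes, "edges": list(graph["edges"])}


# Hypothetical data standing in for the generated graph and a saved layout.
graph = {
    "nodes": [{"id": "Alice"}, {"id": "Bob"}, {"id": "Carol"}],
    "edges": [{"from": "Alice", "to": "Bob"}],
}
positions = {"Alice": (0.0, 0.0), "Bob": (120.5, -40.0)}

frozen = freeze_positions(graph, positions)
serialized = json.dumps(frozen, indent=2)  # text that could be written to a JSON file
```

With coordinates stored per node, the slow layout step only has to run again when the graph itself changes.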

Running online

If you'd like to run the flask server online (e.g. on AWS), run it as python serve.py --prod.

You will also want to create a secret_key.txt file and fill it with random text (see top of serve.py).
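
One way to produce that random text is Python's standard `secrets` module. The sketch below is only a suggestion: the `write_secret_key` helper is hypothetical, and it assumes serve.py accepts any random string as the file's contents. The demo writes to a temporary directory; point the path at secret_key.txt in the project root for real use.

```python
import secrets
import tempfile
from pathlib import Path


def write_secret_key(path, num_bytes=32):
    """Write a random hex string to path and return it.

    token_hex(num_bytes) yields 2 * num_bytes hexadecimal characters
    drawn from a cryptographically secure source.
    """
    key = secrets.token_hex(num_bytes)
    Path(path).write_text(key, encoding="utf-8")
    return key


# Demo: write into a temporary directory instead of the project root.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "secret_key.txt"
    demo_key = write_secret_key(demo_path)
    stored = demo_path.read_text(encoding="utf-8")
```

Like twitter.txt, this file should stay out of version control.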

wfsone/arxiv-network-graph
