Pathway information extracted from 25 years of pathway figures. Hanspers K, Riutta A, Summer-Kutmon M, Pico AR. Genome Biol. 2020 Nov 9;21(1):273. PMID: 33168034
Identifying genes in published pathway figure images. Riutta A, Hanspers K, Pico AR. bioRxiv preprint 2018; doi: https://doi.org/10.1101/379446
Web app for filtering, searching, and viewing the ~65K pathway figures:
https://gladstone-bioinformatics.shinyapps.io/shiny-25years
See also
https://gladstone-bioinformatics.shinyapps.io/shiny-covidpathways/
Bulk downloads of the pathway figures and OCR results are available on figshare.
NLP – natural language processing
...switch over to Anders' slides from 3/5/2021...
(The slides should later be available at https://wiki.library.ucsf.edu/display/NLPBiomed/NLP@UCSF+Meetups)
The lexicon includes four types of human gene symbols (HGNC symbol, HGNC alias, HGNC previous, and bioentities) mapped to NCBI Gene identifiers, from two sources (HGNC and bioentities).
Conflicts were resolved with priority order: HGNC symbol > bioentities > HGNC alias > HGNC previous. For example, if the same symbol from HGNC symbol and HGNC alias mapped to different NCBI Gene IDs, then only the HGNC symbol mapping was included in the lexicon. After curated optimization, the lexicon maps 58,242 unique symbols to 19,176 unique IDs.
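The priority rule above can be illustrated with a minimal Python sketch. It is not the authors' code. The record format, the function name build_lexicon, the tier labels, and the example symbols and IDs are all hypothetical. How ties within the same tier were handled is not stated in the notes, so this sketch simply keeps every ID from the winning tier.

```python
# Priority order from the notes: a lower rank wins a conflict.
# The tier labels are hypothetical names chosen for this sketch.
PRIORITY = {
    "hgnc_symbol": 0,
    "bioentities": 1,
    "hgnc_alias": 2,
    "hgnc_previous": 3,
}


def build_lexicon(records):
    """Map each symbol to the NCBI Gene ID(s) from its highest-priority tier.

    records: iterable of (symbol, ncbi_gene_id, tier) tuples (assumed format).
    """
    best = {}  # symbol -> (rank, set of gene IDs seen at that rank)
    for symbol, gene_id, tier in records:
        rank = PRIORITY[tier]
        if symbol not in best or rank < best[symbol][0]:
            # First sighting, or a higher-priority tier: replace the mapping.
            best[symbol] = (rank, {gene_id})
        elif rank == best[symbol][0]:
            # Same tier: keep all IDs (an assumption, see the lead-in).
            best[symbol][1].add(gene_id)
        # A lower-priority tier is ignored.
    return {symbol: sorted(ids) for symbol, (rank, ids) in best.items()}


# Made-up symbols and IDs, for illustration only.
records = [
    ("ABC1", "111", "hgnc_alias"),
    ("ABC1", "222", "hgnc_symbol"),   # outranks the alias mapping above
    ("XYZ", "333", "hgnc_previous"),
    ("XYZ", "444", "bioentities"),    # outranks the previous-symbol mapping
    ("QRS", "555", "hgnc_previous"),  # no conflict, kept as-is
]
lexicon = build_lexicon(records)
print(lexicon)
```

In this toy input, ABC1 resolves to the HGNC symbol mapping and XYZ to the bioentities mapping, matching the priority order described above.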
Pathway-associated keywords: