Discussion
tante
@tante@tldr.nettime.org  ·  3 weeks ago

Does anyone here know a tool where you can drop in PDFs etc., have it run things like "named entity recognition", and network the extracted concepts?
Years ago I used an open source tool that came from an investigative journalism background, but I can't find it anymore (and I'm blanking on the name).

Philip Gillißen
@guerda@ruhr.social  ·  3 weeks ago

@tante #prodigy, from the makers of #spacy, can perform named entity recognition ( #NER ) on PDFs. I think the interface itself is not open source, but spaCy, which powers the NER, is.

https://prodi.gy/
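
To illustrate the "network the extracted concepts" part of the question, here is a minimal sketch that turns per-document entity sets into a weighted co-occurrence edge list. It does not call Prodigy or spaCy; the file names and entities are hypothetical stand-ins for whatever an NER step would return, and `cooccurrence_edges` is an illustrative helper, not part of any library.

```python
from collections import Counter
from itertools import combinations

# Hypothetical NER output: one set of entity strings per document.
# In practice these would come from an NER tool; they are hard-coded here.
doc_entities = {
    "report_a.pdf": {"Acme Corp", "Jane Doe", "Berlin"},
    "report_b.pdf": {"Acme Corp", "Jane Doe"},
    "memo_c.pdf": {"Berlin", "John Roe"},
}


def cooccurrence_edges(entities_by_doc):
    """Count how many documents mention each unordered pair of entities."""
    edges = Counter()
    for entities in entities_by_doc.values():
        # sorted() gives every pair a canonical (a, b) order with a < b,
        # so the same pair is never counted under two different keys.
        for pair in combinations(sorted(entities), 2):
            edges[pair] += 1
    return edges


edges = cooccurrence_edges(doc_entities)
for (a, b), weight in edges.most_common():
    print(f"{a} <-> {b}: {weight}")
```

The resulting edge list can be loaded into any graph tool; the edge weight is simply the number of documents in which both entities appear.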

Sascha Wolfer
@sascha_wolfer@fediscience.org  ·  last month

What’s your go-to #python or #rstats tool(chain) for splitting #German compounds? I’ve tried a few but was not really satisfied. Maybe I missed something. #NLP #linguistics

Dr. Tim Schatto-Eckrodt
@Kudusch@social.tchncs.de  ·  last month

@sascha_wolfer Have you looked into Holmes? It’s built on top of #spacy and I remember it being able to extract tokens from compound words: https://github.com/richardpaulhudson/holmes-extractor

GitHub - richardpaulhudson/holmes-extractor: Information extraction from English and German texts based on predicate logic
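
For readers unfamiliar with the task, here is a minimal sketch of dictionary-based compound splitting. It is not how Holmes works and does not use its API; it is a simple longest-head-first recursion over a toy lexicon, with an assumed list of German linking elements (Fugenelemente). The function name, the lexicon, and the linker list are all hypothetical choices for illustration.

```python
def split_compound(word, lexicon, linkers=("", "s", "es", "n", "en")):
    """Return one split of `word` into lexicon entries, or None if none is found.

    Tries the longest known head first and allows an optional linking
    element between the head and the remainder of the word.
    """
    word = word.lower()
    if word in lexicon:
        return [word]
    # Walk candidate split points from right to left (longest head first).
    for end in range(len(word) - 1, 0, -1):
        head = word[:end]
        if head not in lexicon:
            continue
        rest = word[end:]
        for linker in linkers:
            # The linker must match and leave something to split afterwards.
            if rest.startswith(linker) and len(rest) > len(linker):
                tail = split_compound(rest[len(linker):], lexicon, linkers)
                if tail is not None:
                    return [head] + tail
    return None


# Toy lexicon (assumption); a real one would come from a word list or corpus.
lexicon = {"arbeit", "platz", "haus", "tür", "schlüssel"}

print(split_compound("Arbeitsplatz", lexicon))
print(split_compound("Haustürschlüssel", lexicon))
```

A purely lexical approach like this over-splits or fails on words missing from the lexicon, which is one reason tools built on a full linguistic pipeline can give better results.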