Vinayak Mehta - Extracting tabular data from PDFs with Camelot & Excalibur - PyCon 2019


Video description for Vinayak Mehta - Extracting tabular data from PDFs with Camelot & Excalibur - PyCon 2019

"Speaker: Vinayak Mehta

Extracting tables from PDFs is hard. The Portable Document Format was not designed for tabular data. Sadly, a lot of open data is shared as PDFs and getting tables out for analysis is a pain. A simple copy-and-paste from a PDF into a text file or spreadsheet program doesn't work.

This talk will briefly touch upon the history of the Portable Document Format, discuss some problems that arise when extracting tabular data from PDFs with the current ecosystem of libraries and tools, and demonstrate how Camelot and Excalibur solve this problem better and in a scalable manner. These easy-to-use packages automatically detect and extract tables from PDFs and give you access to the extracted tables as pandas DataFrames. You can also download them as CSV or Excel files.

Slides can be found at: https://speakerdeck.com/pycon2019 and https://github.com/PyCon/2019-slides"
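The workflow the description outlines (detect tables, get them as pandas DataFrames, save them as CSVs) might look roughly like the sketch below. It is not taken from the talk. It assumes Camelot is installed and uses its documented `camelot.read_pdf` entry point and the `.df` attribute on each detected table. The helper names `extract_tables` and `save_as_csv`, the file name `report.pdf`, and the output layout are hypothetical and only serve the illustration.

```python
from pathlib import Path


def extract_tables(pdf_path, pages="1", flavor="lattice"):
    """Return the tables Camelot detects in pdf_path as pandas DataFrames.

    Hypothetical helper: "lattice" targets tables with ruled lines,
    "stream" targets tables separated by whitespace.
    """
    # Imported lazily so this module can be loaded without Camelot installed.
    import camelot

    tables = camelot.read_pdf(str(pdf_path), pages=pages, flavor=flavor)
    return [table.df for table in tables]


def save_as_csv(frames, out_dir):
    """Write each frame to out_dir/table_<n>.csv and return the written paths.

    Hypothetical helper: works with any object exposing a pandas-style
    to_csv(path, index=False) method.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, frame in enumerate(frames, start=1):
        path = out_dir / f"table_{number}.csv"
        frame.to_csv(path, index=False)
        paths.append(path)
    return paths


# Example usage (assumes a local file named report.pdf exists):
# frames = extract_tables("report.pdf", pages="1-3")
# save_as_csv(frames, "extracted")
```

Keeping extraction and export in separate functions lets you inspect or clean the DataFrames before anything is written to disk. Camelot also offers its own export methods on the returned table objects if that intermediate step is not needed.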
