sshiv012/AFrame

 
 


PolyFrame: A DataFrame interface for database systems

PolyFrame is a Python data exploration library that pairs a Pandas-like user experience with various database systems. It gives analysts a familiar environment while scaling analytical operations out over a large data cluster for Big Data analysis.

Note: This library is proof-of-concept code accompanying our research paper. Not all dataframe functions are implemented (some cannot be), and those that are implemented have not all been fully tested. Proceed with caution.
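To illustrate the general idea of a database-backed dataframe, here is a minimal sketch of a dataframe-like object that records operations as a SQL query and contacts the database only when results are requested. The `LazyFrame` class, its methods, and the `listings` table are hypothetical names invented for this illustration, and SQLite stands in for a real backend. This is not PolyFrame's API or implementation.

```python
import sqlite3


class LazyFrame:
    """Toy dataframe that builds a SQL query instead of loading rows.

    Hypothetical illustration only; this is not PolyFrame's API.
    """

    _OPERATORS = {"=", "!=", "<", "<=", ">", ">="}

    def __init__(self, conn, table, columns="*", predicates=()):
        self._conn = conn
        self._table = table
        self._columns = columns
        self._predicates = tuple(predicates)

    def select(self, *columns):
        # Projection: narrow the column list without running anything.
        return LazyFrame(self._conn, self._table,
                         ", ".join(columns), self._predicates)

    def where(self, column, op, value):
        # Selection: record a parameterized predicate for later.
        if op not in self._OPERATORS:
            raise ValueError("unsupported operator: " + op)
        return LazyFrame(self._conn, self._table, self._columns,
                         self._predicates + ((column, op, value),))

    @property
    def query(self):
        # Assemble the SQL text from the recorded operations.
        sql = "SELECT {} FROM {}".format(self._columns, self._table)
        if self._predicates:
            sql += " WHERE " + " AND ".join(
                "{} {} ?".format(col, op) for col, op, _ in self._predicates)
        return sql

    def head(self, n=5):
        # Only at this point is the query sent to the database.
        params = [value for _, _, value in self._predicates] + [n]
        return self._conn.execute(self.query + " LIMIT ?", params).fetchall()


# Small in-memory table so the sketch runs on its own.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (id INTEGER, city TEXT, price REAL)")
conn.executemany(
    "INSERT INTO listings VALUES (?, ?, ?)",
    [(1, "Irvine", 120.0), (2, "Irvine", 80.0), (3, "Boston", 200.0)],
)

df = LazyFrame(conn, "listings").where("price", ">", 100).select("id", "city")
print(df.query)   # the SQL built so far; nothing has been executed yet
print(df.head())  # runs the query and fetches at most 5 rows
```

Because the filtering and projection happen inside the database, only the requested rows reach the Python process, which is what lets this style of interface scale beyond local memory.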

Dependencies

  • Python >= 3.3
  • Pip
  • Java >= 1.8
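A quick way to see whether these dependencies are present is a short script like the following. It is only a sketch: the helper names `python_ok` and `tool_on_path` are made up for this example, and it checks that `java` and `pip` are on the PATH without verifying the Java version.

```python
import shutil
import sys


def python_ok(minimum=(3, 3)):
    # Compare the running interpreter's (major, minor) to the minimum.
    return sys.version_info[:2] >= minimum


def tool_on_path(name):
    # shutil.which returns None when the executable is not found.
    return shutil.which(name) is not None


print("Python >= 3.3:", python_ok())
print("pip found:", tool_on_path("pip") or tool_on_path("pip3"))
print("java found (version not checked):", tool_on_path("java"))
```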

Installation

  • Clone the repository
  • Install from source (pip install .)

Targeted Database Systems

Example usages of PolyFrame can be found in the 'notebooks' folder.

Example Notebooks

  • Scale-independent Data Analysis with Database-backed Dataframes: a Case Study. EDBT/ICDT Workshops 2021
    [notebook] [paper]
  • Exploratory Data Analysis with Database-backed Dataframes: A Case Study on Airbnb Data. IEEE BigData 2021: 3119-3129
    [notebook] [paper]

Related Publications

  • PolyFrame: A Retargetable Query-based Approach to Scaling Dataframes. Proc. VLDB Endow. 14(11): 2296-2304 (2021)
    [paper]
  • AFrame: Extending DataFrames for Large-Scale Modern Data Analysis. IEEE BigData 2019: 359-371
    [paper]

About

Data Frame API for AsterixDB


Languages

  • Jupyter Notebook 51.0%
  • Python 49.0%