pdmongo

pdmongo.read_mongo(collection: str, query: List[Dict[str, Any]], db: Union[str, pymongo.database.Database], index_col: Union[str, List[str], None] = None, extra: Optional[Dict[str, Any]] = None, chunksize: Optional[int] = None) → pandas.core.frame.DataFrame

Read MongoDB query into a DataFrame.

Returns a DataFrame corresponding to the result set of the query. Optionally provide an index_col parameter to use one or more of the columns as the index; otherwise a default integer index is used.

Parameters:

collection (str) – Mongo collection to select for querying
query (list) – Must be an aggregation pipeline (a list of stage documents). The input is passed to pymongo's aggregate method.
db (pymongo.database.Database or database URI string) – The database to use.
index_col (str or list of str, optional, default: None) – Column(s) to set as the index (a MultiIndex when several columns are given).
extra (dict, optional, default: None) – Additional parameters to pass to the aggregate method.
chunksize (int, optional, default: None) – If specified, return an iterator where chunksize is the number of documents to include in each chunk.

Returns:

DataFrame
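
As a minimal sketch of how the parameters above fit together, the following builds an aggregation pipeline and passes it to read_mongo. The `users` collection, its `name` and `age` fields, and the helper names `build_pipeline` and `load_users` are hypothetical and chosen only for illustration; the call also assumes pdmongo is installed and a MongoDB server is reachable at the given URI.

```python
from typing import Any, Dict, List


def build_pipeline(min_age: int) -> List[Dict[str, Any]]:
    """Return an aggregation pipeline: filter by age, then keep two fields."""
    return [
        # Stage 1: keep only documents whose (hypothetical) age field is >= min_age.
        {"$match": {"age": {"$gte": min_age}}},
        # Stage 2: drop MongoDB's _id and keep only name and age.
        {"$project": {"_id": 0, "name": 1, "age": 1}},
    ]


def load_users(db_uri: str, min_age: int = 18):
    """Read matching documents from the hypothetical `users` collection."""
    # Imported here so build_pipeline can be used without pdmongo installed.
    import pdmongo

    return pdmongo.read_mongo(
        "users",                  # collection
        build_pipeline(min_age),  # query: must be an aggregate pipeline
        db_uri,                   # db: URI string or pymongo Database object
        index_col="name",         # use the name column as the index
    )
```

Calling `load_users("mongodb://localhost:27017/mydb")` against a running server would then return a DataFrame indexed by `name`; the URI is only an example value.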

pdmongo.to_mongo(frame: pandas.core.frame.DataFrame, name: str, db: Union[str, pymongo.database.Database], if_exists: Optional[str] = 'fail', index: Optional[bool] = True, index_label: Union[str, Sequence[str], None] = None, chunksize: Optional[int] = None) → Union[List[pymongo.results.InsertManyResult], pymongo.results.InsertManyResult]

Write records stored in a DataFrame to a MongoDB collection.

Parameters:

frame (DataFrame, Series)
name (str) – Name of collection.
db (pymongo.database.Database or database URI string) – The database to write to.
if_exists ({‘fail’, ‘replace’, ‘append’}, default ‘fail’) –
- fail: If the collection exists, do nothing.
- replace: If the collection exists, drop it, recreate it, and insert data.
- append: If the collection exists, insert data. Create it if it does not exist.
index (boolean, default True) – Write DataFrame index as a column.
index_label (str or sequence, optional) – Column label for index column(s). If None is given (default) and index is True, then the index names are used. A sequence should be given if the DataFrame uses MultiIndex.
chunksize (int, optional) – Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.
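
A short sketch of a write, assuming pandas and pdmongo are installed and a MongoDB server is reachable. The `users` collection, the sample rows, and the helper names `sample_rows` and `write_users` are hypothetical and used only to illustrate the parameters listed above.

```python
from typing import Any, Dict, List


def sample_rows() -> List[Dict[str, Any]]:
    """Return a few illustrative records to write."""
    return [
        {"name": "Ada", "age": 36},
        {"name": "Grace", "age": 45},
    ]


def write_users(db_uri: str):
    """Append the sample rows to the hypothetical `users` collection."""
    # Imported here so sample_rows can be used without these packages installed.
    import pandas as pd
    import pdmongo

    frame = pd.DataFrame(sample_rows())
    return pdmongo.to_mongo(
        frame,
        "users",             # name: target collection
        db_uri,              # db: URI string or pymongo Database object
        if_exists="append",  # insert into the collection, creating it if needed
        index=False,         # do not write the DataFrame index as a column
    )
```

Per the signature above, the return value is an InsertManyResult, or a list of them; passing chunksize would split the write into batches of that many rows.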
