Commit 11af833d authored by Jeremie Hallal

Merge branch 'naturat_columns_sort' into 'master'

natural column sort

See merge request !183
parents 01a37c7f 1172fbe0
Pipeline #54282 passed with stages in 18 minutes and 14 seconds
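Note on the change: the diff replaces `sorted(columns)` with `natsorted(df.columns)` from the natsort package. The difference from plain lexicographic ordering can be sketched with the standard library alone. `natural_key` is a hypothetical helper written for this illustration, not natsort's implementation, and it only handles unsigned integer runs.

```python
import re


def natural_key(name: str) -> list:
    """Split a name into text and integer chunks so digit runs compare numerically."""
    # A capturing group makes re.split keep the digit runs:
    # even positions hold text, odd positions hold digits.
    parts = re.split(r"(\d+)", name)
    return [int(part) if i % 2 else part for i, part in enumerate(parts)]


columns = ["test_10", "test_2", "test_1", "test_20"]

lexicographic = sorted(columns)             # compares character by character
natural = sorted(columns, key=natural_key)  # compares digit runs as integers

print(lexicographic)
print(natural)
```

With plain `sorted`, `test_10` lands before `test_2`; with the natural key, the numeric suffixes are ordered as numbers.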
......@@ -7,10 +7,8 @@ Apache-2.0
========================================================================
The following software have components provided under the terms of this license:
- Pygments (from http://pygments.org/)
- aiohttp (from https://github.com/aio-libs/aiohttp/)
- async-timeout (from https://github.com/aio-libs/async_timeout/)
- bleach (from http://github.com/mozilla/bleach)
- boto3 (from https://github.com/boto/boto3)
- botocore (from https://github.com/boto/botocore)
- coverage (from https://coverage.readthedocs.io)
......@@ -43,9 +41,7 @@ The following software have components provided under the terms of this license:
- pytest-dependency (from https://github.com/RKrahl/pytest-dependency)
- python-dateutil (from https://dateutil.readthedocs.org)
- python-multipart (from http://github.com/andrew-d/python-multipart)
- readme-renderer (from https://github.com/pypa/readme_renderer)
- requests (from http://python-requests.org)
- requests-toolbelt (from https://toolbelt.readthedocs.org)
- rfc3986 (from https://rfc3986.readthedocs.org)
- rsa (from https://stuvel.eu/rsa)
- s3transfer (from https://github.com/boto/s3transfer)
......@@ -61,9 +57,7 @@ BSD-2-Clause
========================================================================
The following software have components provided under the terms of this license:
- Pygments (from http://pygments.org/)
- colorama (from https://github.com/tartley/colorama)
- docutils (from http://docutils.sourceforge.net/)
- grpcio (from https://grpc.io)
- locket (from http://github.com/mwilliamson/locket.py)
- mock (from https://github.com/testing-cabal/mock)
......@@ -81,8 +75,6 @@ BSD-3-Clause
The following software have components provided under the terms of this license:
- HeapDict (from http://stutzbachenterprises.com/)
- Pygments (from http://pygments.org/)
- SecretStorage (from https://github.com/mitya57/secretstorage)
- adlfs (from https://github.com/hayesgb/adlfs/)
- asgiref (from http://github.com/django/asgiref/)
- click (from http://github.com/mitsuhiko/click)
......@@ -92,7 +84,6 @@ The following software have components provided under the terms of this license:
- dask (from http://github.com/dask/dask/)
- decorator (from https://github.com/micheles/decorator)
- distributed (from https://distributed.readthedocs.io/en/latest/)
- docutils (from http://docutils.sourceforge.net/)
- fsspec (from http://github.com/intake/filesystem_spec)
- gcsfs (from https://github.com/dask/gcsfs)
- grpcio (from https://grpc.io)
......@@ -124,7 +115,6 @@ The following software have components provided under the terms of this license:
- tblib (from https://github.com/ionelmc/python-tblib)
- toolz (from http://github.com/pytoolz/toolz/)
- uvicorn (from https://github.com/tomchristie/uvicorn)
- webencodings (from https://github.com/SimonSapin/python-webencodings)
- zict (from http://github.com/dask/zict/)
========================================================================
......@@ -149,13 +139,6 @@ The following software have components provided under the terms of this license:
- numpy (from http://www.numpy.org)
========================================================================
CNRI-Python
========================================================================
The following software have components provided under the terms of this license:
- webencodings (from https://github.com/SimonSapin/python-webencodings)
========================================================================
GPL-2.0-only
========================================================================
......@@ -177,18 +160,10 @@ GPL-3.0-only
The following software have components provided under the terms of this license:
- coverage (from https://coverage.readthedocs.io)
- docutils (from http://docutils.sourceforge.net/)
- grpcio (from https://grpc.io)
- pyparsing (from http://pyparsing.wikispaces.com/)
- rfc3986 (from https://rfc3986.readthedocs.org)
========================================================================
GPL-3.0-or-later
========================================================================
The following software have components provided under the terms of this license:
- docutils (from http://docutils.sourceforge.net/)
========================================================================
ISC
========================================================================
......@@ -246,6 +221,7 @@ The following software have components provided under the terms of this license:
- aioredis (from https://github.com/aio-libs/aioredis)
- anyio (from )
- asgiref (from http://github.com/django/asgiref/)
- atomicwrites (from https://github.com/untitaker/python-atomicwrites)
- attrs (from https://attrs.readthedocs.io/)
- azure-common (from https://github.com/Azure/azure-sdk-for-python)
- azure-core (from https://github.com/Azure/azure-sdk-for-python/tree/master/sdk/core/azure-core)
......@@ -266,17 +242,15 @@ The following software have components provided under the terms of this license:
- grpcio (from https://grpc.io)
- h11 (from https://github.com/python-hyper/h11)
- iniconfig (from http://github.com/RonnyPfannschmidt/iniconfig)
- jeepney (from https://gitlab.com/takluyver/jeepney)
- jmespath (from https://github.com/jmespath/jmespath.py)
- jsonschema (from http://github.com/Julian/jsonschema)
- keyring (from https://github.com/jaraco/keyring)
- msal (from https://github.com/AzureAD/microsoft-authentication-library-for-python)
- msal-extensions (from https://pypi.org/project/msal-extensions/0.1.3/)
- msrest (from https://github.com/Azure/msrest-for-python)
- munch (from http://github.com/Infinidat/munch)
- natsort (from https://github.com/SethMMorton/natsort)
- numpy (from http://www.numpy.org)
- pandas (from http://pandas.pydata.org)
- pkginfo (from https://code.launchpad.net/~tseaver/pkginfo/trunk)
- pluggy (from https://github.com/pytest-dev/pluggy)
- py (from http://pylib.readthedocs.org/)
- pyarrow (from https://arrow.apache.org/)
......@@ -295,7 +269,6 @@ The following software have components provided under the terms of this license:
- sniffio (from https://github.com/python-trio/sniffio)
- structlog (from http://www.structlog.org/)
- toml (from https://github.com/uiri/toml)
- tqdm (from https://github.com/tqdm/tqdm)
- urllib3 (from https://urllib3.readthedocs.io/)
- xmltodict (from https://github.com/martinblech/xmltodict)
- zipp (from https://github.com/jaraco/zipp)
......@@ -307,7 +280,6 @@ The following software have components provided under the terms of this license:
- certifi (from http://certifi.io/)
- charset-normalizer (from https://github.com/ousret/charset_normalizer)
- tqdm (from https://github.com/tqdm/tqdm)
========================================================================
NCSA
......@@ -338,10 +310,8 @@ The following software have components provided under the terms of this license:
- async-timeout (from https://github.com/aio-libs/async_timeout/)
- coverage (from https://coverage.readthedocs.io)
- distributed (from https://distributed.readthedocs.io/en/latest/)
- docutils (from http://docutils.sourceforge.net/)
- google-auth (from https://github.com/GoogleCloudPlatform/google-auth-library-python)
- google-auth-oauthlib (from https://github.com/GoogleCloudPlatform/google-auth-library-python-oauthlib)
- keyring (from https://github.com/jaraco/keyring)
- numpy (from http://www.numpy.org)
- pandas (from http://pandas.pydata.org)
- ply (from http://www.dabeaz.com/ply/)
......@@ -350,7 +320,6 @@ The following software have components provided under the terms of this license:
- pytz (from http://pythonhosted.org/pytz)
- rsa (from https://stuvel.eu/rsa)
- sniffio (from https://github.com/python-trio/sniffio)
- tqdm (from https://github.com/tqdm/tqdm)
- typing-extensions (from https://github.com/python/typing)
- urllib3 (from https://urllib3.readthedocs.io/)
......@@ -402,9 +371,7 @@ public-domain
========================================================================
The following software have components provided under the terms of this license:
- Pygments (from http://pygments.org/)
- botocore (from https://github.com/boto/botocore)
- docutils (from http://docutils.sourceforge.net/)
- grpcio (from https://grpc.io)
- numpy (from http://www.numpy.org)
- pandas (from http://pandas.pydata.org)
......
......@@ -86,7 +86,7 @@ class DataframeSerializerSync:
return pd.read_parquet(data)
@classmethod
-def read_json(cls, data, orient: Union[str, JSONOrient]) -> 'DataframeSerializerAsync.DataframeClass':
+def read_json(cls, data, orient: Union[str, JSONOrient], convert_axes: Optional[bool] = None) -> 'DataframeSerializerAsync.DataframeClass':
"""
:param data: bytes str content (valid JSON str), path object or file-like object
:param orient:
......@@ -94,7 +94,7 @@ class DataframeSerializerSync:
"""
orient = JSONOrient.get(orient)
-return pd.read_json(path_or_buf=data, orient=orient.value).replace("NaN", np.NaN)
+return pd.read_json(path_or_buf=data, orient=orient.value, convert_axes=convert_axes).replace("NaN", np.NaN)
class DataframeSerializerAsync:
......@@ -117,7 +117,7 @@ class DataframeSerializerAsync:
)
@with_trace("Parquet JSON deserialization")
-async def read_json(self, data, orient: Union[str, JSONOrient]) -> DataframeClass:
+async def read_json(self, data, orient: Union[str, JSONOrient], convert_axes: Optional[bool] = None) -> DataframeClass:
return await asyncio.get_event_loop().run_in_executor(
-self.executor, DataframeSerializerSync.read_json, data, orient
+self.executor, DataframeSerializerSync.read_json, data, orient, convert_axes
)
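Note on the hunk above: `loop.run_in_executor` passes only positional arguments to the callable (keyword arguments would need `functools.partial`), which is consistent with `convert_axes` being appended positionally. A minimal sketch of the same pattern follows. `parse` and `parse_async` are hypothetical stand-ins for the serializer methods, not the project's code.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def parse(data: str, orient: str, convert_axes: Optional[bool] = None) -> tuple:
    # Stand-in for a blocking parser; it simply echoes its arguments.
    return (data, orient, convert_axes)


async def parse_async(executor, data: str, orient: str,
                      convert_axes: Optional[bool] = None) -> tuple:
    loop = asyncio.get_running_loop()
    # Extra arguments are forwarded positionally, in order, to parse().
    return await loop.run_in_executor(executor, parse, data, orient, convert_axes)


def demo() -> tuple:
    with ThreadPoolExecutor(max_workers=1) as pool:
        return asyncio.run(parse_async(pool, "{}", "split", False))


result = demo()
print(result)
```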
......@@ -41,6 +41,7 @@ from app.routers.sessions import (SessionInternal, UpdateSessionState, UpdateSes
WithSessionStorages, get_session_dependencies)
from app.routers.record_utils import fetch_record
from app.helper.traces import with_trace
+from natsort import natsorted
router_bulk = APIRouter() # router dedicated to bulk APIs
......@@ -72,7 +73,7 @@ async def get_df_from_request(request: Request, orient: Optional[str] = None) ->
if MimeTypes.JSON.match(ct):
content = await request.body() # request.stream()
try:
-return await DataframeSerializerAsync().read_json(content, orient)
+return await DataframeSerializerAsync().read_json(content, orient, convert_axes=False)
except ValueError:
raise HTTPException(status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
detail='invalid body') # TODO
......@@ -139,7 +140,7 @@ class DataFrameRender:
if params.curves:
selection = list(map(str.strip, params.curves.split(',')))
columns = DataFrameRender.get_matching_column(selection, set(df))
-df = df[sorted(columns)]
+df = df[columns]
if params.offset:
head_index = df.head(params.offset, npartitions=-1, compute=False).index
......@@ -148,6 +149,8 @@ class DataFrameRender:
if params.limit and params.limit > 0:
df = df.head(params.limit, npartitions=-1, compute=False)
+df = df[natsorted(df.columns)]
return df
@staticmethod
......
......@@ -75,7 +75,7 @@ def _create_df_from_response(response):
elif content_type == 'text/csv; charset=utf-8':
return pd.read_csv(f, index_col=0)
elif content_type == 'application/json':
-return pd.read_json(f, dtype=True, orient='split')
+return pd.read_json(f, dtype=True, orient='split', convert_axes=False)
else:
raise ValueError(f"Unknown content-type: '{content_type}'")
......@@ -189,7 +189,7 @@ def setup_client(init_fixtures):
['float_MD', 'float_X'],
['str_MD', 'str_X'],
['date_MD', 'date_X'],
-['MD', 'float_X', 'str_X', 'date_X']
+['MD', 'date_X', 'float_X', 'str_X']
])
def test_send_all_data_once(setup_client,
entity_type,
......@@ -243,7 +243,7 @@ def test_send_all_data_once(setup_client,
['float_MD', 'float_X'],
['str_MD', 'str_X'],
['date_MD', 'date_X'],
-['MD', 'float_X', 'str_X', 'date_X']
+['MD', 'date_X', 'float_X', 'str_X']
])
def test_send_all_data_once_post_data_v2_get_data_v3(setup_client,
entity_type,
......@@ -722,6 +722,32 @@ def test_session_chunk_int(setup_client, entity_type, content_type_header, creat
headers=headers)
assert chunk_response_1.status_code == expected_code
+@pytest.mark.parametrize("data_format", ['parquet', 'json'])
+@pytest.mark.parametrize("accept_content", ['application/x-parquet', 'application/json'])
+@pytest.mark.parametrize("columns_name", [
+list(map(str, range(100))),
+list(map(lambda x: f'test_{x}', range(100))),
+list(map(lambda x: f'{x}_test_{x%10}', range(100)))
+])
+def test_nat_sort_columns(setup_client, data_format, accept_content, columns_name):
+""" Create session, append chunking with consecutive index, validate session """
+entity_type = 'WellLog'
+client, _ = setup_client
+record_id = _create_record(client, entity_type)
+chunking_url = Definitions[entity_type]['chunking_url']
+_create_chunks(client, entity_type, record_id=record_id, data_format=data_format,
+cols_ranges=[(columns_name, range(20))])
+data_response = client.get(f'{chunking_url}/{record_id}/data', headers={'accept': accept_content})
+assert data_response.status_code == 200
+response_df = _create_df_from_response(data_response)
+assert list(response_df.columns) == columns_name
# todo:
# - concurrent sessions using fromVersion in Integrations tests
# - index: check if dataframe has an index
......
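Note on `convert_axes=False`: by default `pd.read_json` tries to convert axis labels to a proper dtype, so numeric-looking column names such as "10" may not come back as strings. That is presumably why the commit passes `convert_axes=False` on both the server and the test side. The sketch below uses made-up sample data and shows the labels preserved as strings.

```python
from io import StringIO

import pandas as pd

# Sample payload in 'split' orientation; the column labels are numeric-looking strings.
payload = (
    '{"columns": ["1", "10", "2"], '
    '"index": [0, 1], '
    '"data": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]}'
)

# convert_axes=False leaves the axis labels exactly as they appear in the JSON.
df = pd.read_json(StringIO(payload), orient="split", convert_axes=False)
labels = list(df.columns)
print(labels)
```

Keeping the labels as strings means a later natural sort compares them the same way the test's expected `columns_name` list is built.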