
Is there a quick way to find the global quantile of a pandas dataframe?



By : user3099359
Date : January 11 2021, 03:32 PM
You can use NumPy's np.quantile [numpy-doc] for that. With no axis argument it flattens the DataFrame's values, so the result is a single quantile over every cell:
code :
>>> import numpy as np
>>> np.quantile(df, 0.2)
10.6
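>>> # Older pandas also exposed NumPy as pd.np, so pd.np.quantile(df, 0.2) gave the same 10.6.
>>> # That alias was deprecated in pandas 1.0 and removed in 2.0; import numpy directly instead.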
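The original df is not shown, so the following is a minimal sketch on hypothetical data. It compares the NumPy call with a pure-pandas equivalent (stacking all columns into one Series) and with the per-column quantiles that df.quantile returns by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the DataFrame from the question is not shown.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

# Global quantile: every cell is treated as one flat array.
global_q = np.quantile(df.to_numpy(), 0.2)

# Pure-pandas equivalent: stack the columns into a single Series first.
stacked_q = df.stack().quantile(0.2)

# For contrast, DataFrame.quantile works column by column.
per_column_q = df.quantile(0.2)

print(global_q, stacked_q)
print(per_column_q)
```

If the data may contain NaN, np.quantile returns NaN; np.nanquantile (or the stacked pandas version, which skips missing values) is likely the better fit in that case.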


Quick way to find all permutations of a pandas DataFrame that preserves a sort?



By : user3575333
Date : March 29 2020, 07:55 AM
Since you're sorting by age, group by age first: compute every permutation of the row index within each age group, then take the Cartesian product of those per-group permutations (using itertools' permutations and product functions):
code :
In [10]: from itertools import permutations, product

In [11]: age = df.groupby("age")
In [12]: age.get_group(21)
Out[12]:
   age   name
2   21  Chris
4   21   Evan

In [13]: list(permutations(age.get_group(21).index))
Out[13]: [(2, 4), (4, 2)]

In [14]: [df.loc[list(p)] for p in permutations(age.get_group(21).index)]
Out[14]:
[   age   name
 2   21  Chris
 4   21   Evan,    age   name
 4   21   Evan
 2   21  Chris]
In [21]: [list(permutations(grp.index)) for (name, grp) in age]
Out[21]: [[(1,)], [(2, 4), (4, 2)], [(3,)], [(0,)]]

In [22]: list(product(*[(permutations(grp.index)) for (name, grp) in age]))
Out[22]: [((1,), (2, 4), (3,), (0,)), ((1,), (4, 2), (3,), (0,))]
In [23]: [sum(tups, ()) for tups in product(*[(permutations(grp.index)) for (name, grp) in age])]
Out[23]: [(1, 2, 4, 3, 0), (1, 4, 2, 3, 0)]
In [24]: [df.loc[list(sum(tups, ()))] for tups in product(*[list(permutations(grp.index)) for (name, grp) in age])]
Out[24]:
[   age   name
 1   20    Bob
 2   21  Chris
 4   21   Evan
 3   22  David
 0   28    Abe,    age   name
 1   20    Bob
 4   21   Evan
 2   21  Chris
 3   22  David
 0   28    Abe]
In [25]: [list(df.loc[list(sum(tups, ())), "name"]) for tups in product(*[(permutations(grp.index)) for (name, grp) in age])]
Out[25]:
[['Bob', 'Chris', 'Evan', 'David', 'Abe'],
 ['Bob', 'Evan', 'Chris', 'David', 'Abe']]
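The session above can be collected into one self-contained sketch. The DataFrame is reconstructed from the printed output, and the helper name sorted_permutations is hypothetical, introduced here only for illustration.

```python
from itertools import chain, permutations, product

import pandas as pd

# Data reconstructed from the output shown above.
df = pd.DataFrame(
    {
        "age": [28, 20, 21, 22, 21],
        "name": ["Abe", "Bob", "Chris", "David", "Evan"],
    }
)


def sorted_permutations(frame, key):
    """Yield every row ordering of `frame` that keeps `key` sorted ascending.

    Hypothetical helper: rows are only shuffled within groups of equal `key`.
    """
    # groupby sorts the group keys by default, so groups arrive in sorted order.
    groups = [grp.index for _, grp in frame.groupby(key)]
    # One permutation per group, combined across groups with a Cartesian product.
    for combo in product(*(permutations(idx) for idx in groups)):
        yield frame.loc[list(chain.from_iterable(combo))]


orders = [list(f["name"]) for f in sorted_permutations(df, "age")]
print(orders)
```

The number of results is the product of the factorials of the group sizes, so this is only practical when the tied groups are small.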
quantile normalization on pandas dataframe



By : Sarah Yam
Date : March 29 2020, 07:55 AM
OK, I implemented the method myself, and it is reasonably efficient.
In hindsight the logic seems fairly simple, but I decided to post it here for anyone who is as confused as I was when I couldn't find existing code by searching.
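The poster's code is not included here. As a stand-in, the following is a minimal sketch of the standard rank-mean form of quantile normalization. It is not necessarily the poster's implementation, and it assumes numeric columns of equal length with no missing values. The function name quantile_normalize and the sample data are hypothetical.

```python
import numpy as np
import pandas as pd


def quantile_normalize(frame):
    """Give every column the same distribution (assumes no NaN values)."""
    # Sort each column independently, then average across columns at each rank.
    rank_means = np.sort(frame.to_numpy(), axis=0).mean(axis=1)
    # Zero-based rank of each value within its column; ties are broken by
    # order of appearance, which is one of several possible tie rules.
    ranks = frame.rank(method="first").astype(int) - 1
    # Replace each value with the cross-column mean for its rank.
    return pd.DataFrame(
        rank_means[ranks.to_numpy()], index=frame.index, columns=frame.columns
    )


# Hypothetical sample data.
raw = pd.DataFrame(
    {"c1": [5, 2, 3, 4], "c2": [4, 1, 4, 2], "c3": [3, 4, 6, 8]},
    index=["A", "B", "C", "D"],
)
normalized = quantile_normalize(raw)
print(normalized)
```

After normalization every column contains the same set of values, just in a different order. Other tie-handling rules (for example averaging the means of tied ranks) are also common and give slightly different results.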
Rank Pandas dataframe by quantile



By : Cindy Xu
Date : March 29 2020, 07:55 AM
Method 1: mul and np.ceil
You were quite close with rank. Multiply the percentile rank by 5 with .mul to get the desired quantile, and round up with np.ceil:
code :
np.ceil(df.rank(axis=1, pct=True).mul(5))
             AC   BO    C  CCM   CL  CRD   CT   DA   GC   GF
2010-01-19  5.0  2.0  2.0  4.0  1.0  1.0  3.0  4.0  5.0  3.0
2010-01-20  2.0  2.0  5.0  1.0  1.0  3.0  4.0  5.0  3.0  4.0
2010-01-21  5.0  2.0  2.0  4.0  1.0  1.0  3.0  4.0  5.0  3.0
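To get integers instead of floats, cast with astype(int), or with astype('Int64') (pandas' nullable integer type) if the data has missing values:
code :
np.ceil(df.rank(axis=1, pct=True).mul(5)).astype(int)
np.ceil(df.rank(axis=1, pct=True).mul(5)).astype('Int64')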
            AC  BO  C  CCM  CL  CRD  CT  DA  GC  GF
2010-01-19   5   2  2    4   1    1   3   4   5   3
2010-01-20   2   2  5    1   1    3   4   5   3   4
2010-01-21   5   2  2    4   1    1   3   4   5   3
Method 2: scipy.stats.percentileofscore
code :
# requires: from scipy import stats
d = df.apply(lambda x: [np.ceil(stats.percentileofscore(x, a, 'rank')*0.05) for a in x], axis=1).values

pd.DataFrame(data=np.concatenate(d).reshape(d.shape[0], len(d[0])), 
             columns=df.columns, 
             dtype='int', 
             index=df.index)
            AC  BO  C  CCM  CL  CRD  CT  DA  GC  GF
2010-01-19   5   2  2    4   1    1   3   4   5   3
2010-01-20   2   2  5    1   1    3   4   5   3   4
2010-01-21   5   2  2    4   1    1   3   4   5   3
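A self-contained sketch of Method 1 on hypothetical data (the original df is only visible through its output). With five distinct values per row, each value lands in its own quintile, which makes the result easy to check by eye.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two rows, five columns, no ties.
df = pd.DataFrame(
    [[10, 20, 30, 40, 50], [5, 4, 3, 2, 1]],
    index=["r1", "r2"],
    columns=list("ABCDE"),
)

# Percentile rank within each row: 0.2, 0.4, ..., 1.0 for five distinct values.
pct = df.rank(axis=1, pct=True)

# Scale to 0..5 and round up, so each value falls into a quintile 1..5.
quintile = np.ceil(pct.mul(5)).astype(int)
print(quintile)
```

Because the result relies on np.ceil of a floating-point product, values that sit exactly on a bucket boundary can be sensitive to rounding. It may be worth spot-checking boundary cases on real data.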
Quick way to find previous instance of a value in a pandas Dataframe or numpy array?



By : user3286220
Date : March 29 2020, 07:55 AM
The question concerns a large data set (millions of rows) read into a pandas DataFrame called datafile. A grouped shift is a faster approach:
code :
datafile['Prev_Price'] = datafile.groupby('OrderId')['Price'].shift(fill_value=0)
   Price   Qty  OrderId  Prev_Price
0  26690  3000  1213772           0
1  26700  3000  1215673           0
2  26705  6000  1216656           0
3  26700  3000  1213772       26690
4  26710  3000  1215673       26700
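For a runnable version, the input can be reconstructed from the output above. The sketch below also shows what happens without fill_value, since that choice affects the dtype of the new column. The name prev_with_nan is hypothetical.

```python
import pandas as pd

# Input reconstructed from the printed output above.
datafile = pd.DataFrame(
    {
        "Price": [26690, 26700, 26705, 26700, 26710],
        "Qty": [3000, 3000, 6000, 3000, 3000],
        "OrderId": [1213772, 1215673, 1216656, 1213772, 1215673],
    }
)

# Shift Price down by one row within each OrderId; the first row of each
# order has no predecessor and gets 0.
datafile["Prev_Price"] = datafile.groupby("OrderId")["Price"].shift(fill_value=0)

# Without fill_value, the missing predecessors become NaN and the column
# is upcast to float.
prev_with_nan = datafile.groupby("OrderId")["Price"].shift()

print(datafile)
```

Using 0 as the filler keeps the column integer-typed, but it only works if 0 can never be a real price. Otherwise leaving NaN is the safer marker for "no previous row".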
How to Use Groupby Quantile with Pandas Dataframe



By : user3522291
Date : March 29 2020, 07:55 AM
If I understand you correctly, you want GroupBy with pd.qcut to assign each row to a quantile bin within its group, and then keep the rows in the highest bin:
code :
quantiles = (
    df.groupby(['Name', 'Date'])['Value'].apply(lambda x: pd.qcut(x, 4, labels=[0, 0.25, 0.5, 1]))
)

top_quantile_df = df[quantiles.eq(1)]
    Name     Date Item  Quantity  Unit Cost  Value
0   Alex  2018 Q1   AA         9       8.97  80.73
5   Alex  2018 Q2   AA         4       7.00  28.00
8    Ray  2018 Q1   AA         8       5.30  42.40
11   Ray  2018 Q2   DD         4       8.00  32.00
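As an alternative sketch, the same kind of selection can be made without pd.qcut by broadcasting each group's 75th-percentile value back onto its rows with transform. This is a different technique from the answer above, and it can differ from qcut when values tie at the boundary. The data and the names threshold and top are hypothetical.

```python
import pandas as pd

# Hypothetical data: two groups of four rows each.
df = pd.DataFrame(
    {
        "Name": ["Alex"] * 4 + ["Ray"] * 4,
        "Date": ["2018 Q1"] * 8,
        "Value": [10.0, 20.0, 30.0, 40.0, 1.0, 2.0, 3.0, 4.0],
    }
)

# 75th percentile of Value per (Name, Date), repeated on every row of the
# group so it aligns with df's own index.
threshold = df.groupby(["Name", "Date"])["Value"].transform(
    lambda s: s.quantile(0.75)
)

# Keep the rows at or above their group's threshold.
top = df[df["Value"] >= threshold]
print(top)
```

Because transform returns a Series with the same index as df, the boolean mask aligns directly, without relying on how groupby.apply labels its result.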