pandas: unexpected result when reindex(ing) a Panel



By : djangoJC
Date : November 21 2020, 07:35 AM
This is a bug that was fixed in pandas 0.13.1 (it is still present in 0.13.0).
Updating pandas resolves the issue.
Pandas groupby result shape unexpected



By : user3469926
Date : March 29 2020, 07:55 AM
I have time-series data in "stacked" format and would like to compute a rolling function based on two columns. However, groupby is concatenating my results horizontally instead of vertically. I can apply stack at the end to get back to tall format, but I thought the correct behavior would be to concatenate vertically, which would allow assignment back to the original dataframe (something like x['res'] = df.groupby(...).apply(func)). Does anyone know why groupby is not behaving as expected, or am I doing something wrong?

You can convert the result to a DataFrame in func():
code :
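import pandas as pd

def func(s):
    # pd.rolling_sum() was removed in later pandas releases; use .rolling(3).sum()
    return (s.a.rolling(3).sum() / s.b.rolling(3).sum()).dropna().to_frame()

# df is the asker's stacked frame with columns 'group', 'a' and 'b'
df.groupby('group').apply(func)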
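
A minimal sketch of the same idea on a small made-up frame. The column names group, a and b follow the answer; the data values and the column name res are hypothetical. Returning a one-column DataFrame from func makes apply stack the groups vertically, and dropping the group level lets the result align back onto the original rows:

```python
import pandas as pd

# Hypothetical stacked data: two groups of four rows each.
df = pd.DataFrame({
    "group": ["x"] * 4 + ["y"] * 4,
    "a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "b": [1.0] * 8,
})

def func(s):
    # Ratio of 3-row rolling sums; to_frame() keeps the result "tall".
    ratio = s["a"].rolling(3).sum() / s["b"].rolling(3).sum()
    return ratio.dropna().to_frame("res")

# Selecting the value columns keeps the grouping column out of func's input.
result = df.groupby("group")[["a", "b"]].apply(func)

# result is indexed by (group, original row label); drop the group level
# so the values align with df's own index on assignment.
df["res"] = result["res"].droplevel(0)
print(df)
```

The first two rows of each group stay NaN because a 3-row window is not yet full there.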
filling the missing points in the time series data with pandas.date_range and pandas.reindex python



By : London Pub Crawl
Date : March 29 2020, 07:55 AM
I think you can skip reading the file with genfromtxt and use only read_csv, then find the min and max dates to build the index for the reindex method.
Or use resample:
code :
import pandas as pd
import numpy as np
import io

temp=u""""2011-08-26 00:00:00",1155179,3.232,23.7,3.281,0.386,25.27,111.5665,28.92,29.83,19.13,0,111.5,13.02,29.77,345.7
"2011-08-26 00:00:30",1155180,3.289,20.44,2.153,0.222,25.25,111.5735,28.94,29.82,19.53,0,111.5,13.02,29.79,342.4
"2011-08-26 23:59:30",1155297,12.62,28.06,3.162,1.356,24.3,111.4614,28.65,29.84,19.53,0,111.4,13.06,29.50,350.1"""

# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=",", index_col=[0], parse_dates=[0], header=None)
print(df)
                          1       2      3      4      5      6         7   \
0                                                                            
2011-08-26 00:00:00  1155179   3.232  23.70  3.281  0.386  25.27  111.5665   
2011-08-26 00:00:30  1155180   3.289  20.44  2.153  0.222  25.25  111.5735   
2011-08-26 23:59:30  1155297  12.620  28.06  3.162  1.356  24.30  111.4614   

                        8      9      10  11     12     13     14     15  
0                                                                         
2011-08-26 00:00:00  28.92  29.83  19.13   0  111.5  13.02  29.77  345.7  
2011-08-26 00:00:30  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4  
2011-08-26 23:59:30  28.65  29.84  19.53   0  111.4  13.06  29.50  350.1  
start = df.index.min()
end = df.index.max()
print(start)
2011-08-26 00:00:00
print(end)
2011-08-26 23:59:30

index = pd.date_range(start,end,freq="30S")
sk_f = df.reindex(index)
print(sk_f.head())
                          1      2      3      4      5      6         7   \
2011-08-26 00:00:00  1155179  3.232  23.70  3.281  0.386  25.27  111.5665   
2011-08-26 00:00:30  1155180  3.289  20.44  2.153  0.222  25.25  111.5735   
2011-08-26 00:01:00      NaN    NaN    NaN    NaN    NaN    NaN       NaN   
2011-08-26 00:01:30      NaN    NaN    NaN    NaN    NaN    NaN       NaN   
2011-08-26 00:02:00      NaN    NaN    NaN    NaN    NaN    NaN       NaN   

                        8      9      10  11     12     13     14     15  
2011-08-26 00:00:00  28.92  29.83  19.13   0  111.5  13.02  29.77  345.7  
2011-08-26 00:00:30  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4  
2011-08-26 00:01:00    NaN    NaN    NaN NaN    NaN    NaN    NaN    NaN  
2011-08-26 00:01:30    NaN    NaN    NaN NaN    NaN    NaN    NaN    NaN  
2011-08-26 00:02:00    NaN    NaN    NaN NaN    NaN    NaN    NaN    NaN  
# the fill_method argument was removed from resample(); chain .ffill() instead
print(df.resample('30S').ffill().head())
                          1      2      3      4      5      6         7   \
0                                                                           
2011-08-26 00:00:00  1155179  3.232  23.70  3.281  0.386  25.27  111.5665   
2011-08-26 00:00:30  1155180  3.289  20.44  2.153  0.222  25.25  111.5735   
2011-08-26 00:01:00  1155180  3.289  20.44  2.153  0.222  25.25  111.5735   
2011-08-26 00:01:30  1155180  3.289  20.44  2.153  0.222  25.25  111.5735   
2011-08-26 00:02:00  1155180  3.289  20.44  2.153  0.222  25.25  111.5735   

                        8      9      10  11     12     13     14     15  
0                                                                         
2011-08-26 00:00:00  28.92  29.83  19.13   0  111.5  13.02  29.77  345.7  
2011-08-26 00:00:30  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4  
2011-08-26 00:01:00  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4  
2011-08-26 00:01:30  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4  
2011-08-26 00:02:00  28.94  29.82  19.53   0  111.5  13.02  29.79  342.4
pandas: reindex panel with dataframe index



By : Debasis
Date : March 29 2020, 07:55 AM
The reindexing fails because of the time component, so reindex using only the date component of your DatetimeIndex:
code :
pnl2 = pnl.reindex(df.index.date)
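
Panel was removed from pandas in 0.25, so the sketch below illustrates the same mismatch with a plain DataFrame instead. The frame, its column v and the timestamps are all hypothetical, and normalize() is swapped in for .date here because the made-up daily index holds midnight timestamps rather than date objects:

```python
import pandas as pd

# Hypothetical daily data, indexed at midnight.
daily = pd.DataFrame(
    {"v": [1.0, 2.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02"]),
)

# Hypothetical timestamps that carry a time-of-day component.
stamps = pd.DatetimeIndex(["2020-01-01 09:30", "2020-01-02 16:00"])

# No label matches exactly, so every row comes back as NaN.
missed = daily.reindex(stamps)

# Stripping the time component makes the labels line up again.
matched = daily.reindex(stamps.normalize())

print(missed)
print(matched)
```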
Unexpected result in pandas pivot_table



By : user1438927
Date : March 29 2020, 07:55 AM
Use the string 'size' instead. This triggers the pandas interpretation of "size", i.e. the number of elements in a group. The NumPy interpretation of size is the product of the lengths of each dimension.
code :
df = pd.pivot_table(df, aggfunc='size', index=["IND"], columns="DATA")

print(df)

DATA   2    3    4    10
IND                     
1     1.0  NaN  1.0  NaN
2     NaN  1.0  NaN  NaN
3     NaN  NaN  1.0  NaN
4     1.0  NaN  NaN  NaN
5     NaN  2.0  NaN  1.0
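
To make the two meanings of "size" concrete, here is a small sketch. The input rows are a hypothetical reconstruction chosen to match the table above, and groupby(...).size() is used as a plain way to show the per-group element count:

```python
import numpy as np
import pandas as pd

# Hypothetical input consistent with the pivot table shown above.
df = pd.DataFrame({
    "IND":  [1, 1, 2, 3, 4, 5, 5, 5],
    "DATA": [2, 4, 3, 4, 2, 3, 3, 10],
})

# pandas meaning: number of rows in each (IND, DATA) group.
counts = df.groupby(["IND", "DATA"]).size().unstack()
print(counts)

# NumPy meaning: product of the lengths of all dimensions.
total_elements = np.size(np.zeros((2, 3)))
print(total_elements)
```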
Pandas reindex and interpolate time series efficiently (reindex drops data)



By : Anıl
Date : March 29 2020, 07:55 AM
The only (simple) way I can see of doing this is to use resample to upsample to your time resolution (say 1 second), then reindex.
Get an example DataFrame:
code :
import numpy as np
import pandas as pd

np.random.seed(2)

df = (pd.DataFrame()
 .assign(SampleTime=pd.date_range(start='2018-10-01', end='2018-10-08', freq='30T')
                    + pd.to_timedelta(np.random.randint(-5, 5, size=337), unit='s'),
         Value=np.random.randn(337)
         )
 .set_index(['SampleTime'])
)
df.head()

                        Value
SampleTime
2018-10-01 00:00:03     0.033171
2018-10-01 00:30:03     0.481966
2018-10-01 01:00:01     -0.495496
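# Reindex onto the union of the original and desired timestamps, interpolate
# by time, then keep only the desired timestamps.
desired_index = pd.date_range('2018-10-01', periods=10, freq='30T')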
(df
 .reindex(df.index.union(desired_index))
 .interpolate(method='time')
 .reindex(desired_index)
)

                        Value
2018-10-01 00:00:00     NaN
2018-10-01 00:30:00     0.481218
2018-10-01 01:00:00     -0.494952
2018-10-01 01:30:00     -0.103270
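
The same union, interpolate, reindex pattern on a deterministic toy series, where the interpolated values are easy to check by hand. The timestamps and values are hypothetical:

```python
import pandas as pd

# Hypothetical samples that never land exactly on the desired grid.
df = pd.DataFrame(
    {"Value": [5.0, 15.0, 25.0]},
    index=pd.to_datetime([
        "2018-10-01 00:00:05",
        "2018-10-01 00:00:15",
        "2018-10-01 00:00:25",
    ]),
)
desired_index = pd.to_datetime(["2018-10-01 00:00:10", "2018-10-01 00:00:20"])

# A plain reindex finds no matching labels, so the data is dropped.
dropped = df.reindex(desired_index)

# Union first, interpolate linearly in time, then select the grid points.
interpolated = (
    df.reindex(df.index.union(desired_index))
      .interpolate(method="time")
      .reindex(desired_index)
)
print(interpolated)
```

Each desired timestamp sits halfway between two samples, so the interpolated values fall halfway between the neighbouring sample values.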