Dataframe Timestamp Filter for new/repeating value



By : user3100511
Date : January 12 2021, 07:00 PM
The data has been converted into a two-column dataframe with a Local Date_Local Time column and a Close column; it is stock/index data. Check if this works for you:
code :
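import numpy as np
# If 'Value' was read as text such as '3,020.97', convert it to float first:
# df['Value'] = df['Value'].str.replace(',', '').astype(float)
df['diff']=df.groupby('Day')['Value'].diff().ne(0)  # False where a row repeats the previous value of the same day
a=((df.reset_index().groupby('diff')['index'].apply(np.array))[0])[:4]  # index labels of the first four repeated rows
df.drop(a,inplace=True)
df.drop('diff',axis=1,inplace=True)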
Input:
Day     Time    Value
0   29-Jul-19   22:09   3,020.97
1   29-Jul-19   22:08   3,020.97
2   29-Jul-19   22:07   3,020.97
3   29-Jul-19   22:06   3,020.97
4   29-Jul-19   22:05   3,020.97
5   29-Jul-19   22:04   3,020.98
6   29-Jul-19   22:03   3,020.97
7   29-Jul-19   22:02   3,020.94
8   29-Jul-19   22:01   3,020.89
9   29-Jul-19   22:01   3,020.91
10  29-Jul-19   22:01   3,020.98
11  29-Jul-19   22:01   3,020.98
12  29-Jul-19   22:01   3,020.92
Output:
Day     Time    Value
0   29-Jul-19   22:09   3020.97
5   29-Jul-19   22:04   3020.98
6   29-Jul-19   22:03   3020.97
7   29-Jul-19   22:02   3020.94
8   29-Jul-19   22:01   3020.89
9   29-Jul-19   22:01   3020.91
10  29-Jul-19   22:01   3020.98
11  29-Jul-19   22:01   3020.98
12  29-Jul-19   22:01   3020.92
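For comparison, here is a minimal sketch of a simpler variant that keeps only rows whose value differs from the previous row of the same day. It is not the answer's exact method: it drops every consecutive repeat rather than only the first four, so on the full table above it would also drop row 11. The sample data is a hypothetical excerpt of the input table.

```python
import pandas as pd

# Hypothetical excerpt of the input table shown above.
df = pd.DataFrame({
    "Day": ["29-Jul-19"] * 7,
    "Time": ["22:09", "22:08", "22:07", "22:06", "22:05", "22:04", "22:03"],
    "Value": ["3,020.97", "3,020.97", "3,020.97", "3,020.97",
              "3,020.97", "3,020.98", "3,020.97"],
})

# Convert text such as '3,020.97' to float so that diff() works.
df["Value"] = df["Value"].str.replace(",", "", regex=False).astype(float)

# True where the value differs from the previous row of the same day.
# The first row of each day has a NaN diff, which also counts as "new".
is_new = df.groupby("Day")["Value"].diff().ne(0)

filtered = df[is_new]
print(filtered.index.tolist())  # [0, 5, 6]
```

Because the mask is a plain boolean Series, no helper column has to be added and dropped again.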


Filter rows by timestamp in DataFrame of SparkR



By : Justin M.
Date : March 29 2020, 07:55 AM
Spark 1.6+
You should be able to use the unix_timestamp function and a standard SQLContext:
code :
ts <- unix_timestamp(df$Timestamp, 'MM/dd/yyyy HH:mm:ss') %>%
  cast("timestamp")

df %>% 
   where(ts <  cast(lit("2015-03-01 00:00:00"), "timestamp"))
# Alternative: run the same filter as a SQL query through a Hive context
sqlContext <- sparkRHive.init(sc)

query <- "SELECT * FROM df
    WHERE unix_timestamp(Timestamp, 'MM/dd/yyyy HH:mm:ss') < 
          unix_timestamp('2015-03-01 00:00:00')" # yyyy-MM-dd HH:mm:ss 

df <- createDataFrame(sqlContext, ...)
registerTempTable(df, 'df')

head(sql(sqlContext, query))

##   ID           Timestamp
## 1  1 08/01/2014 11:18:30
## 2  2 01/01/2015 12:13:45
Filter dataframe according to month from timestamp



By : riaz ali
Date : March 29 2020, 07:55 AM
I think you first need to get the months from the index with DatetimeIndex.month and then check membership with np.in1d, because the output of DatetimeIndex.month is a numpy array:
code :
import numpy as np
import pandas as pd

# if necessary, convert the index to a DatetimeIndex first
#df.index = pd.to_datetime(df.index)

print (type(df.index.month))
<class 'numpy.ndarray'>

df1 = df[np.in1d(df.index.month, oct_may)]
print (df1)
                     max  con    pf
Timestamp                          
2017-03-21 23:00:00  123  232  0.91
2017-03-22 00:00:00  122  232  0.91
2017-03-22 02:00:00  121  232  0.91
2017-03-22 03:00:00  118  232  0.89
2017-03-22 05:00:00  121  232  0.91
2017-03-22 06:00:00  123  232  0.89

df2 = df[np.in1d(df.index.month, jun_sep)]
print (df2)
                     max  con    pf
Timestamp                          
2017-08-22 01:00:00  122  232  0.92
2017-09-22 04:00:00  120  232  0.90
# Alternative: wrap the months in a Series and use isin
df1 = df[pd.Series(df.index.month, index=df.index).isin(oct_may)]
print (df1)
                     max  con    pf
Timestamp                          
2017-03-21 23:00:00  123  232  0.91
2017-03-22 00:00:00  122  232  0.91
2017-03-22 02:00:00  121  232  0.91
2017-03-22 03:00:00  118  232  0.89
2017-03-22 05:00:00  121  232  0.91
2017-03-22 06:00:00  123  232  0.89
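In recent pandas versions DatetimeIndex.month returns an Index rather than a numpy array, so the membership test can probably be written directly with Index.isin. The sketch below assumes such a version; the contents of oct_may and jun_sep are not shown in the answer, so the month lists here are assumptions, and the data is a hypothetical excerpt.

```python
import pandas as pd

# Hypothetical excerpt of the data shown above.
idx = pd.to_datetime([
    "2017-03-21 23:00:00",
    "2017-03-22 00:00:00",
    "2017-08-22 01:00:00",
    "2017-09-22 04:00:00",
])
df = pd.DataFrame({"max": [123, 122, 122, 120]}, index=idx)
df.index.name = "Timestamp"

# Assumed month lists; the original post does not show them.
oct_may = [10, 11, 12, 1, 2, 3, 4, 5]
jun_sep = [6, 7, 8, 9]

# Index.isin returns a boolean array that can be used as a row mask.
df1 = df[df.index.month.isin(oct_may)]
df2 = df[df.index.month.isin(jun_sep)]

print(df1)
print(df2)
```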
Filter Dataframe based on Timestamp column



By : Srini
Date : March 29 2020, 07:55 AM
The requirement is to filter the dataframe on its timestamp column so that only rows at most 10 minutes old remain. Given the dataframe as
code :
+----+------------------+-----+
|ID  |timestamp         |value|
+----+------------------+-----+
|ID-1|8/23/2017 14:48:13|4.56 |
|ID-2|8/23/2017 6:5:21  |5.92 |
|ID-3|8/23/2017 5:49:13 |6.0  |
+----+------------------+-----+ 
and the current timestamp being 2017-08-23 14:53:33:
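import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.TimestampType
// parse the string column to epoch seconds, keep rows younger than 10 minutes,
// then cast the seconds back to a timestamp
df.withColumn("timestamp", unix_timestamp($"timestamp", "MM/dd/yyyy HH:mm:ss"))
  .filter((unix_timestamp(current_timestamp()) - $"timestamp") / 60 < 10)
  .select($"ID", $"timestamp".cast(TimestampType), $"value")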
+----+---------------------+-----+
|ID  |timestamp            |value|
+----+---------------------+-----+
|ID-1|2017-08-23 14:48:13.0|4.56 |
+----+---------------------+-----+
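The same idea can be tried out without a Spark cluster. The following is a rough pandas analogue, not the Spark answer itself: it parses the string column, computes each row's age against a fixed "now" (an assumption made so the example is reproducible), and keeps rows younger than 10 minutes.

```python
import pandas as pd

# Same rows as the table above.
df = pd.DataFrame({
    "ID": ["ID-1", "ID-2", "ID-3"],
    "timestamp": ["8/23/2017 14:48:13", "8/23/2017 6:5:21", "8/23/2017 5:49:13"],
    "value": [4.56, 5.92, 6.0],
})

# Fixed reference time for reproducibility; pd.Timestamp.now() would be
# the live equivalent.
now = pd.Timestamp("2017-08-23 14:53:33")

# Parse month/day/year strings into datetimes.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%m/%d/%Y %H:%M:%S")

# Age of each row; keep rows that are not in the future and under 10 minutes old.
age = now - df["timestamp"]
recent = df[(age >= pd.Timedelta(0)) & (age < pd.Timedelta(minutes=10))]

print(recent)
```

Only ID-1 is about five minutes older than the reference time, so it is the only row kept, which matches the Spark output above.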
Using another timestamp dataframe to filter a timestamp dataframe on pandas



By : Tehranix
Date : March 29 2020, 07:55 AM
Use between with the minimum and maximum datetimes of df2:
code :
df3 = df[df['timestamp'].between(df2['timestamp'].min(), df2['timestamp'].max())]
print (df3)
    Id           timestamp
5  285 2017-05-22 11:52:48
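A self-contained sketch of the same call, using hypothetical data (the original frames are not shown in the post). Note that between is inclusive on both ends by default and that this approach uses the overall span of df2, not each individual timestamp.

```python
import pandas as pd

# Hypothetical frames; only the column names follow the answer above.
df = pd.DataFrame({
    "Id": [280, 285, 290],
    "timestamp": pd.to_datetime([
        "2017-05-22 11:40:00",
        "2017-05-22 11:52:48",
        "2017-05-22 12:30:00",
    ]),
})
df2 = pd.DataFrame({
    "timestamp": pd.to_datetime(["2017-05-22 11:50:00", "2017-05-22 12:00:00"]),
})

# Earliest and latest timestamps of the second frame define the window.
start, end = df2["timestamp"].min(), df2["timestamp"].max()

df3 = df[df["timestamp"].between(start, end)]
print(df3)
```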
R - Filter 1st Dataframe with conditions of timestamp from DF2



By : user2165685
Date : March 29 2020, 07:55 AM
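Define a helper function that checks whether a value falls inside any interval of DF2 (first column as start, second column as end), then filter DF1 with it: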
code :
library(tidyverse)
is_in_interval <- function(x,interval_df){
  (x >= pull(interval_df,1) & x <= pull(interval_df,2)) %>% any()
}
DF1 %>% filter(unlist(map(DateTime, ~ unlist(is_in_interval(.x,DF2)))))
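For readers working in pandas, the interval check can be sketched roughly as follows. This is an analogue of the R helper, not a translation taken from the post: the data is hypothetical, and like the R code it treats the first two columns of DF2 as interval start and end.

```python
import pandas as pd

# Hypothetical frames; only the DateTime column name comes from the R code.
DF1 = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2020-01-01 10:00:00",
        "2020-01-01 12:30:00",
        "2020-01-01 15:00:00",
    ]),
})
DF2 = pd.DataFrame({
    "start": pd.to_datetime(["2020-01-01 09:00:00", "2020-01-01 14:00:00"]),
    "end": pd.to_datetime(["2020-01-01 11:00:00", "2020-01-01 16:00:00"]),
})

# Shape (n_rows, 1) against (1, n_intervals) broadcasts to every pair.
x = DF1["DateTime"].to_numpy()[:, None]
starts = DF2.iloc[:, 0].to_numpy()[None, :]
ends = DF2.iloc[:, 1].to_numpy()[None, :]

# A row is kept if it lies inside at least one interval (bounds inclusive).
mask = ((x >= starts) & (x <= ends)).any(axis=1)

result = DF1[mask]
print(result)
```

The broadcast builds an n_rows by n_intervals boolean matrix, so for very large frames an interval-based join may be a better fit.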