How to combine multiple rows in pandas with shared column values

By : user3099775
Date : January 11 2021, 03:34 PM
Question: I am using pandas to read JSON objects and output the data as CSV files.
Answer: I believe what you are looking for is a groupby:
code :
import numpy as np
import pandas as pd
data = pd.DataFrame([[1,5,10,np.nan,5,np.nan],[2,1,10,np.nan,5,np.nan],[1,5,np.nan,1,np.nan,-10],[2,1,np.nan,1,np.nan,-10]],columns=['u1','u2','c1','c2','c3','c4'])
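# Select the value columns with a list (double brackets); tuple-style selection fails on current pandas
df = data.groupby(['u1','u2'])[['c1','c2','c3','c4']].sum().reset_index()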
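As a minimal sketch of what the groupby above does, the block below rebuilds the same sample frame and compares `sum()` with `first()`. The names `value_cols`, `summed`, and `coalesced` are only illustrative. Note that `sum()` adds values when a group has more than one non-null entry in a column, whereas `first()` keeps the first non-null entry, so which one fits depends on the data.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(
    [[1, 5, 10, np.nan, 5, np.nan],
     [2, 1, 10, np.nan, 5, np.nan],
     [1, 5, np.nan, 1, np.nan, -10],
     [2, 1, np.nan, 1, np.nan, -10]],
    columns=['u1', 'u2', 'c1', 'c2', 'c3', 'c4'],
)

# Illustrative name: the columns whose partial rows should be merged
value_cols = ['c1', 'c2', 'c3', 'c4']

# sum() skips NaN, so each group collapses to one fully populated row
summed = data.groupby(['u1', 'u2'])[value_cols].sum().reset_index()

# first() takes the first non-null value per column instead of adding
coalesced = data.groupby(['u1', 'u2'])[value_cols].first().reset_index()

print(summed)
print(coalesced)
```

With this sample both results are identical because every column has exactly one non-null value per group.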


How To Combine Multiple Rows Into One Based on Shared Value in Pandas

By : Leon Owade
Date : March 29 2020, 07:55 AM
Question: I have a DataFrame that generically looks like this:
Answer: Starting from the original DataFrame:
code :
In [150]: df
Out[150]: 
  Country  Education    GDP
0     USA          5  45000
1     USA          3  68000
2  Canada          7  34000
3  Canada          9  46000
In [151]: df1 = df.groupby('Country').nth(0).reset_index()

In [152]: df1
Out[152]: 
  Country  Education    GDP
0  Canada          7  34000
1     USA          5  45000

In [153]: df2 = df.groupby('Country').nth(1).reset_index()

In [154]: df2
Out[154]: 
  Country  Education    GDP
0  Canada          9  46000
1     USA          3  68000
In [155]: pd.concat([df1, df2.drop('Country', axis=1)], axis=1)
Out[155]: 
  Country  Education    GDP  Education    GDP
0  Canada          7  34000          9  46000
1     USA          5  45000          3  68000
In [165]: df3 = pd.concat([df1, df2.drop('Country', axis=1)], axis=1)

In [166]: df3 = df3[['Country', 'Education', 'GDP']]

In [167]: df3
Out[167]: 
  Country  Education  Education    GDP    GDP
0  Canada          7          9  34000  46000
1     USA          5          3  45000  68000
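The transcript above produces duplicate column names. As an alternative sketch (not the answer's own method), the rows of each country can be numbered with `cumcount()` and then pivoted with `unstack()`, which gives distinct column names. The helper column `n` and the `Education_0`-style names are choices made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['USA', 'USA', 'Canada', 'Canada'],
    'Education': [5, 3, 7, 9],
    'GDP': [45000, 68000, 34000, 46000],
})

# Number the rows within each country: 0 for the first, 1 for the second
numbered = df.assign(n=df.groupby('Country').cumcount())

# Move the row number into the columns, one block per original column
wide = numbered.set_index(['Country', 'n']).unstack('n')

# Flatten the two-level column index into names such as 'GDP_1'
wide.columns = [f'{name}_{n}' for name, n in wide.columns]
wide = wide.reset_index()

print(wide)
```

This also generalizes to groups with more than two rows, where missing positions become NaN.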
Combine two rows in pandas data frame having same values in multiple columns and comparing data in another column

By : Ilham Danu Saputra
Date : March 29 2020, 07:55 AM
Question: I am very new to Python pandas. I have a sorted pandas data frame with 10k+ rows. Here is the sample data frame:
Answer: 1) Some useful imports
code :
import pandas as pd
import numpy as np
import datetime as dt
import itertools
import re
df = pd.read_csv("data.csv", sep="|", header=None, names=["time", "mseid", "name", "uec", "mid", "cid"])
df["time"] = [dt.datetime.strptime(":".join(re.findall(r'\d+', time_string)), "%H:%M:%S") for time_string in df["time"]]
df["mseid"] = [mseid.split(":")[-1] for mseid in df["mseid"]]
df["name"] = [name.split(":")[-1] for name in df["name"]]
df["uec"] = [uec.split(":")[-1] for uec in df["uec"]]
df["mid"] = [mid.split(":")[-1] for mid in df["mid"]]
df["cid"] = [cid.split(":")[-1] for cid in df["cid"]]
df_sorted = df.sort_values(["name", "time"]).groupby("name").groups.values()
>>> dict_values([Int64Index([10, 12, 6, 7, 8, 4, 0, 9, 11], dtype='int64'), Int64Index([13, 2, 5, 1, 3], dtype='int64')])

# https://stackoverflow.com/questions/12355442/converting-a-list-of-tuples-into-a-simple-flat-list
ordered = list(itertools.chain(*zip(*df_sorted)))
num_groups = int(len(ordered) / 2)
ordered += [ind for ind in df.index if ind not in ordered]
ordered
>>> [10, 13, 12, 2, 6, 5, 7, 1, 8, 3, 0, 4, 9, 11]


df = df.iloc[ordered]
df = df.reset_index()
del df['index']
df.head()

>>>     time    mseid   name    uec mid cid
0   1900-01-01 12:30:24 459 I_SECONDROW 10  93  20337
1   1900-01-01 12:30:26 500 M_FIRSTROWW 1   80  20110
2   1900-01-01 12:30:24 500 I_SECONDROW 1   80  20110
3   1900-01-01 12:30:31 459 M_FIRSTROWW 10  93  20337
4   1900-01-01 12:30:26 459 I_SECONDROW 10  93  20337
groups = [val for val in range(num_groups) for _ in [0, 1]]
remainder = len(df.index) - len(groups)
groups = groups + ["-" for i in range(remainder)]
df["pair"] = groups
groups

>>> [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, '-', '-', '-', '-']
pairs = df.groupby("pair")["time"]
time_delta = []
for pair in pairs:
    if len(pair[1]) == 2:
        second, first = pair[1].values
        time_difference = abs(int((first - second)/1000000000)) # nanoseconds to seconds
        time_delta.append(time_difference)
time_delta = [val for val in time_delta for _ in [0, 1]]
remainder = len(df.index) - len(time_delta)
time_delta = time_delta + [np.nan for i in range(remainder)]  # np.NaN was removed in NumPy 2.0
df["time_delta"] = time_delta
df

>>>     time    mseid   name    uec mid cid pair    time_delta
0   1900-01-01 12:30:24 459 I_SECONDROW 10  93  20337   0   2.0
1   1900-01-01 12:30:26 500 M_FIRSTROWW 1   80  20110   0   2.0
2   1900-01-01 12:30:24 500 I_SECONDROW 1   80  20110   1   7.0
3   1900-01-01 12:30:31 459 M_FIRSTROWW 10  93  20337   1   7.0
4   1900-01-01 12:30:26 459 I_SECONDROW 10  93  20337   2   116.0
df[df.time_delta <=5].head().groupby("pair").head()

    time    mseid   name    uec mid cid pair    time_delta
0   1900-01-01 12:30:24 459 I_SECONDROW 10  93  20337   0   2.0
1   1900-01-01 12:30:26 500 M_FIRSTROWW 1   80  20110   0   2.0
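The pairing logic above depends on row order. A shorter sketch of the same idea, under the assumption that rows belonging together share a key column, is to group on that key and subtract the earliest time from the latest. The four-row frame and the choice of `mseid` as the shared key are hypothetical and are not taken from the original data file.

```python
import pandas as pd

# Hypothetical sample: two rows per mseid, times parsed as 1900-01-01 datetimes
df = pd.DataFrame({
    'time': pd.to_datetime(['12:30:24', '12:30:26', '12:30:24', '12:30:31'],
                           format='%H:%M:%S'),
    'mseid': ['459', '459', '500', '500'],
    'name': ['I_SECONDROW', 'M_FIRSTROWW', 'I_SECONDROW', 'M_FIRSTROWW'],
})

# Latest minus earliest time within each group, broadcast back to every row
times = df.groupby('mseid')['time']
df['time_delta'] = (times.transform('max') - times.transform('min')).dt.total_seconds()

# Keep only the groups whose rows are at most 5 seconds apart
close_pairs = df[df['time_delta'] <= 5]

print(close_pairs)
```

Using `.dt.total_seconds()` avoids the manual division by 1000000000 to convert nanoseconds to seconds.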
Take rows that share a value in one column and combine values from another column in pandas dataframe

By : terminalnode
Date : March 29 2020, 07:55 AM
Question: I have a pandas DataFrame with multiple rows that can share an ID. Each row also has a value for the "label" column. What I would like is to combine all the labels that share the same ID.
Answer: You need to group by the ID and collect the labels into lists:
code :
df.groupby('id').label.apply(list).reset_index()

id       label 
1       [a, b]
2    [a, c, d]
3          [e]
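A self-contained version of the one-liner above might look like the following sketch. The input frame is reconstructed from the output shown, so its row order is an assumption, and the comma-joined variant is an extra illustration rather than part of the original answer.

```python
import pandas as pd

# Sample input reconstructed from the output above (row order assumed)
df = pd.DataFrame({
    'id': [1, 1, 2, 2, 2, 3],
    'label': ['a', 'b', 'a', 'c', 'd', 'e'],
})

# One row per id, with the labels collected into a Python list
as_lists = df.groupby('id')['label'].agg(list).reset_index()

# Variant: join the labels into a single string, which is easier to write to CSV
as_text = df.groupby('id')['label'].agg(', '.join).reset_index()

print(as_lists)
print(as_text)
```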
How to replace certain rows by shared column values in pandas DataFrame?

By : Rambo Wu
Date : March 29 2020, 07:55 AM
Question: Let's say I have the following pandas DataFrame:
Answer: Try this:
code :
d= df[df['Age']!='#'].set_index('Name')['Age']
df['Age']=df['Name'].replace(d)
     Name Age
0    Alex  10
1     Bob  12
2  Clarke  13
3     Bob  12
4     Bob  12
5     Bob  12
6  Clarke  13
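The two lines above build a Name-to-Age lookup from the rows that have a real age and then apply it to every row. A runnable sketch of the same idea follows, using `map` in place of `replace`. The input frame is an assumption reconstructed from the output (the original question's data is not shown), with `'#'` standing in for a missing age.

```python
import pandas as pd

# Assumed input: '#' marks rows whose age is missing
df = pd.DataFrame({
    'Name': ['Alex', 'Bob', 'Clarke', 'Bob', 'Bob', 'Bob', 'Clarke'],
    'Age': ['10', '12', '13', '#', '#', '#', '#'],
})

# Lookup table: one known age per name (index must be unique for map)
known = (
    df.loc[df['Age'] != '#']
      .drop_duplicates('Name')
      .set_index('Name')['Age']
)

# Fill every row from the lookup; names without a known age become NaN
df['Age'] = df['Name'].map(known)

print(df)
```

One difference worth noting: `replace` leaves unmatched names in place, while `map` turns them into NaN.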
How do I combine data accurately using python pandas between columns with shared column values?

By : user3612148
Date : March 29 2020, 07:55 AM
A combination of the INDEX and MATCH formulas will work in MS Excel.
For example, to get the AGE values in your desired result, you conceptually do this:
code :
=INDEX(X; MATCH(Y;Z))
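Since the question asks about pandas, a rough pandas counterpart of the INDEX/MATCH lookup is a left merge on the shared column. This is only a sketch: the two frames, their contents, and the names `people`, `visits`, and `result` are hypothetical, because the original tables are not shown.

```python
import pandas as pd

# Hypothetical lookup table (the range INDEX would read from)
people = pd.DataFrame({
    'Name': ['Alex', 'Bob', 'Clarke'],
    'Age': [10, 12, 13],
})

# Hypothetical table that needs the Age column filled in
visits = pd.DataFrame({'Name': ['Bob', 'Alex', 'Bob', 'Dana']})

# A left merge keeps every row of visits and looks up Age by Name;
# names missing from the lookup table get NaN
result = visits.merge(people, on='Name', how='left')

print(result)
```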