For example:

import polars as pl

df = pl.DataFrame({
    "Col Ind": ['A', 'B', 'C', 'D', 'E'],
    "Col A": [1, 2, 3, 4, 5],
    "Col B": [2, 4, 6, 8, 10],
    "Col C": [1, 3, 5, 7, 9],
    "Col D": [5, 4, 3, 2, 1],
})
I would like to end up with a dataframe that has a sixth row giving the total of each numeric column, and a seventh row giving each column's total as a percentage of the sum of those totals.
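To make the target concrete, here is a rough plain-Python sketch (no Polars involved) of the values I would expect in the two extra rows for the sample data. The names `totals`, `grand_total`, and `percentages` are only illustrative, and I am assuming the percentages are truncated to whole numbers.

```python
# Plain-Python check of the expected "Total" and "Percentage" rows.
# Illustrative only: the variable names are hypothetical, not part of any API.
data = {
    "Col A": [1, 2, 3, 4, 5],
    "Col B": [2, 4, 6, 8, 10],
    "Col C": [1, 3, 5, 7, 9],
    "Col D": [5, 4, 3, 2, 1],
}

# Sixth row: the total of each numeric column.
totals = {name: sum(values) for name, values in data.items()}

# Seventh row: each column total as a percentage of the grand total,
# truncated to a whole number (assumption: truncation, not rounding).
grand_total = sum(totals.values())
percentages = {
    name: int(total / grand_total * 100) for name, total in totals.items()
}

print(totals)       # {'Col A': 15, 'Col B': 30, 'Col C': 25, 'Col D': 15}
print(percentages)  # {'Col A': 17, 'Col B': 35, 'Col C': 29, 'Col D': 17}
```

So the "Total" row should read 15, 30, 25, 15 and the "Percentage" row 17, 35, 29, 17, with "Total" and "Percentage" as the labels in "Col Ind".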
To create the two new rows I have had to come up with the following very convoluted method:
first_col = df.select("Col Ind").to_series().append(pl.Series("temp", ["Total", "Percentage"]))
df = df.drop("Col Ind")
cols = df.columns
expr = df.select(pl.sum(cols))
rowlist = list(expr.row(0))
full = sum(rowlist)
pc_row = []
for n in range(len(rowlist)):
    pc_row.append(int(rowlist[n] / full * 100))
pc_dict = dict(zip(cols, pc_row))
pc_df = pl.DataFrame(pc_dict)
df = pl.concat([df, expr])
df = pl.concat([df, pc_df])
df.insert_column(0, first_col)
This works, but it seems like an awful lot of steps.
Is there a simpler way?
Thank you