Linq?

Ask your PowerShell-related questions, including questions on cmdlet development!
Forum rules
Do not post any licensing information in this forum.

Any code longer than three lines should be added as code using the 'Select Code' dropdown menu or attached as a file.
This topic is 7 years and 9 months old and has exceeded the time allowed for comments. Please begin a new topic or use the search feature to find a similar but newer topic.
Locked
User avatar
LtMandella
Posts: 61
Last visit: Mon May 07, 2018 4:03 pm

Linq?

Post by LtMandella »

Being a TSql coder for 20 years, I am clueless about using linq.

However, now I have a powershell studio GUI app where I need to group and summarize data in a datatable (not in sql table).

Do I have any options aside from LINQ?

If it has to be LINQ does anyone have an example of what a LINQ query against a datatable would look like that would group by multiple columns, and perform some aggregates on other columns?

thanks,
ken
jvierra
Posts: 15439
Last visit: Tue Nov 21, 2023 6:37 pm
Answers: 30
Has voted: 4 times
Been upvoted: 33 times

Re: Linq?

Post by jvierra »

A datatable has a dataview which allows sorting and grouping. There is no fundamental difference between a sql datatable and a datatable in Powershell'


Linq is a compiler married component that would be difficult to use in PowerShell.
jvierra
Posts: 15439
Last visit: Tue Nov 21, 2023 6:37 pm
Answers: 30
Has voted: 4 times
Been upvoted: 33 times

Re: Linq?

Post by jvierra »

Note that we can just use plain PowerShell:

$datatable | sort LastName

or

$datatable | Group-Object Lastname,firstname
User avatar
LtMandella
Posts: 61
Last visit: Mon May 07, 2018 4:03 pm

Re: Linq?

Post by LtMandella »

thank you!

I will experiment with group-object and the datatable dataview.

In desperation I am bulk inserting the data into sql server and then reimporting back into datatable using sql grouping and aggregates.

I will have some relatively large rowsets in the datatable ( > 5K records ) so performance will also be a consideration.
jvierra
Posts: 15439
Last visit: Tue Nov 21, 2023 6:37 pm
Answers: 30
Has voted: 4 times
Been upvoted: 33 times

Re: Linq?

Post by jvierra »

Sorting and grouping, if complex, would be faster in SQL due to indexes.

I can sort and group 100000+ records in PowerShell in less than a minute. Of course in SQLServer it can be done in about 5 seconds.
User avatar
LtMandella
Posts: 61
Last visit: Mon May 07, 2018 4:03 pm

Re: Linq?

Post by LtMandella »

not complex, but a good amount of columns and rows...

This works (copied pretty much verbatim from internet) but it did take about a minute for 10K recs, and that won't work of course for interactive GUI. I expect the powershell could be optimized, but I may just use the sql version since I doubt the powershell version will ever be fast enough. And even 10K rows across the wire to sql server and back is just a couple secs...

($Cloned is the existing datatable that contains records to be grouped and aggregated...)

$gdata = $Cloned | Group-Object -Property client_code, matter_code, employee_code, employee_name, part_cat_code, empl_uno

[System.Guid]$metaBatchGUID = [System.Guid]::NewGuid()

$test = @()
$test += foreach ($item in $gdata)
{
$item.Group | Select -Unique @{ Name = 'metaBatchGUID'; Expression = { $metaBatchGUID } },
client_code, matter_code,
@{ Name = 'ROW_UNO'; Expression = { (($item.Group) | measure -Property ROW_UNO -Maximum).Maximum } },
rec_type, employee_name, employee_code, client_uno, part_cat_code, eff_date, to_date, empl_uno,
@{ Name = 'Percentage'; Expression = { (($item.Group) | measure -Property percentage -sum).Sum } },
bill_employee_code, last_modified, matter_uno, status_code, metaToUpdate, new_row_uno,
metaActivityUserName,
metaActivityDateTime, metaDBActivityType, metaAppActivityType, metaValidationResult,
metaDBUpdateResult, metaRowSetType, metaComment

}


thanks again for your help!
jvierra
Posts: 15439
Last visit: Tue Nov 21, 2023 6:37 pm
Answers: 30
Has voted: 4 times
Been upvoted: 33 times

Re: Linq?

Post by jvierra »

I think you need to do some more homework on PowerShell.

Grouping by everything seems foolish. Why?
User avatar
LtMandella
Posts: 61
Last visit: Mon May 07, 2018 4:03 pm

Re: Linq?

Post by LtMandella »

The rows in the datatable need to be de-duped and aggregation performed on any groups containing > 1 row.

The natural key is a composite of a number of those columns. But not all of the columns are keys, so I could "max" or "min" a few more of them.

But would that make a 20x to 30X performance improvement to be competitive with the sql approach?

Possible I guess but not likely.
jvierra
Posts: 15439
Last visit: Tue Nov 21, 2023 6:37 pm
Answers: 30
Has voted: 4 times
Been upvoted: 33 times

Re: Linq?

Post by jvierra »

$deduped=$datatable | select * -unique
User avatar
LtMandella
Posts: 61
Last visit: Mon May 07, 2018 4:03 pm

Re: Linq?

Post by LtMandella »

thank you!

that might be a great solution, I will see if I can adapt it.

But there are some columns in the dataset that are not duplicated (arbitrary unique system row numbers) and have to ignored (so you don't see them in the example but that is why I didn't use "*").

I also have to calculate an aggregate on one of the columns for all records in sets of duplicates.

I know this stuff is hard to visualize without example data. But it really is typical database grouping and aggregating. The powershell provides the correct results, but for greater than about 1000 records is painfully slow.


That is why I thought of first using LINQ, so I could perform the grouping and aggregations without RBAR loop.
This topic is 7 years and 9 months old and has exceeded the time allowed for comments. Please begin a new topic or use the search feature to find a similar but newer topic.
Locked