• SUMIF using 2 conditions (Excel 97)

    Author
    Topic
    #360988

    I have often used SUMIF to sum a column range based on a value in another column of the same range. Now I want to sum a column of numbers based on values in two other columns. Anyone know how to do it?

    Viewing 0 reply threads
    Author
    Replies
    • #544804

      You have a couple of choices:

      You could create a new column that would reflect the combined result of the two columns you want to use, and then use that third column as the criteria range for the sumif. Assuming that you have a logical (true/false) value in column A and in column B, and you want the sum of all values in column C where A and B are equal to “true” you would insert a new column C with the formula (in C1)
      = And(A1 = TRUE, B1 = TRUE).
      This will return TRUE iff A and B are true. The values you wanted to sum are now in Column D, and your sumif would look like:

      =SUMIF(C1:C15,TRUE,D1:D15)
      assuming that the data range extends down 15 rows

      This will work, but I find it a little kludgy – it could be improved by making the second argument a reference: in this way you could change the value in the reference cell from true to false in order to obtain the total of all cells where the values in A and B are not both true.

      I think a better approach is to use an array formula – you can be much more flexible in your criteria, and you don’t have to insert (and possibly hide) un-needed columns in your spreadsheet.
      Array formulas opearate on arguments with multiple values – like lists or ranges of cells. The array formula equivalent to the first approach above would be:

      {=SUM(IF(($A$1:$A$15=TRUE)*($B$1:$B$15=TRUE) =1,$D$1:$D$15,0))}
      note that the bracket before the “=” sign and after the rest of the expression is not typed, but is added by Excel to signify an array formula after it is entered with Ctrl-Alt-Shift

      This formula takes advantage of the fact that logical ‘true’ takes on the arithmetic value of “1” and logical false takes on the value of “0” when used in a calculation.

      The array formula takes the sum of a series of “IF” statements, where the IF statement returns either the value in cell Dn or zero. For each value in the range A1:A15 the logical condition (An = TRUE) will evaluate as 1 if An is TRUE, and as 0 otherwise, and likewise for the values in column B. The logical values tested by the IF function are thus TRUE when both An and Bn are TRUE, and FALSE otherwise. The IF returns the value in Dn when the logical condition is TRUE, and zero otherwise, so the total returned by the function is the sum of all cells in column D where the values in column A and B of that row are both true.

      Because we are dealing with boolean values, it is possible to trim this function further, to:

      {=SUM(IF($A$1:$A$15*$B$1:$B$15=1,$D$1:$D$15,0))}

      since they will only take on the values of 1 or 0, and we know we want to test for them both being TRUE or 1. This ‘streamlined’ version does not generalize well, however.

      As for the function above using a synthesized result column and a SUMIF function, this approach can be generalized by testing a reference rather than a constant value, as follows:

      {=SUM(IF(($A$1:$A$15=A24)*($B$1:$B$15=B24) =1,$D$1:$D$15,0))}

      where the values set in A24 and B24 determine whether a particular row is included or excluded in the total. Although this may not be required when the data is nicely arranged with boolean values to test, it can be very useful if the values are more nomal variables, where, for instance

      {=SUM(IF(($A$33:$A$47>A51)*($B$33:$B$47=B51) =1,$D$33:$D$47,0))}

      could be used to identify items greater than a certain size, and related to a particular subset of the original data. I find that array formulas can be used in this way to provide similar utility as pivot tables, but they are easier to set up, particularly for “minor” applications – I use an array formula to track my billable hours by client, for instance, where a pivot table would be overkill.

      I have included a s/sheet (XL 2K) with example of all these calculations – array formulas are easier to see than to explain.

      • #544809

        Another way could be to use VBA.

        It may well be clearer to maintain.

        If your formulas are straightforward, it will probably run slower. But if you have a spreadsheet with hundereds (or even thousands) of array formulas, it may well run faster in VBA. I have had that situation, and changing it to VBA saved heaps of calc time.

        It was not a straight convert- I had to change around the way the whole thing worked- but the spreadsheet becamse just som much easier to maintain, as well as being faster.

        It probably doesn’y apply in this case- but it’s worth bearing in mind.

        • #545544

          Dean has asked me to exapnd on this.

          I’m quite sure that a worksheet function will be faster than a VBA solution.

          My situation wqas that I had several thousand rows in the Excel spreadsheet. I had several hundred array formulas around the worksheet. The user would change a geographic area, and the worksheet would take over a minute to calculate. So each worksheet function had to scan the whole array to calculate- so perhaps 200,000 calculations.

          I changed the whole way the spreadsheet worked. When the user changed an area, the VBA scanned the array just once, and populated the several hundred cells with results of calculations. So only thousands of calculations involved. The worksheet only contained simple formulas now- and the result was spectacularly faster.

          • #545578

            Geoff (and anyone else watching ):

            It sounds to me like one of two things, and perhaps a little of both, were happening:

            1) The array formulas were duplicating the same calculation and / or referencing the same source ranges in the calculation. This meant that Excel was running through the examinations of the same cells numerous times (worst case, 100 times, if you had 100 array formulas on the sheet). Even though array formulas evaluate much faster than VBA functions, a VBA sub that performed multiple evaluations on one pass and then dumped the result(s) into specified locations was still faster, since you avoided the 100 read/compare cycles of the array functions.

            2) By limiting the sub to run when cells in a given range were changed (by testing whether the active cell was in a specified range following the on-entry or on-change event, perhaps) the number of times the VBA sub was called could be reduced.

            I can see how either of these approaches could improve the sheet responsiveness by reducing the time spent recalculating after every entry. Were these the approaches you used, and did you have other calculation-saving tweeks in place?

            • #545590

              Dean,

              Mostly #1.

              So, for instance, cell A1 gave me a count or all cells where column 1 was 1 value, column 2 another, and column 5 another. Cell A2 was checking the same cells for a different combination of values (a gross simplification, but it will do).

              Cell 2 checked a different column combinations.

              #2 did not apply. When a different geographical area was selected, virtually the whole sheet changed.

              Yes, I’m sure I had other tweaks. They all made differences- but undoubtedly, the biggest one was using the VBA. And, for a an extra benefit, when I had to change the basic formula, I had to only change it in about 5 lines of VBA, not 100 times in complex cell formulas.

              fwiw, I did have one change elsewhere in the worksheet which gave a big speed improvement. in a different situation.

              The source data (the several thousand rows) came from an external text file. I would read one record, and set each of about 12 cells in the next empty row. This process was slow. There was a “critical mass” where it slowed down dramatically when I added one more column. I found that building an array of all values in the VBA, and then at the end moving that array into the worksheet really speeded things up.

            • #545639

              Geoff

              Your statement: “then at the end moving that array into the worksheet really speeded things up” really caught my attention.

              I am working on something similar (importing ~ 50,000 lines of csv values into an excel workbook) and would love to see an example of the code you used for your “move-array” function. I am confident I can build the array without problems but have not found any examples of how to then move the array into a row of cells. I currently place each value individually… and yes, it is SLOW!

            • #545668

              Fred,

              Some sample code:

              Dim svalues(3, 3) As Variant
              Dim i As Integer
              Dim j As Integer
              
              For i = 0 To 3
                For j = 0 To 3
                  svalues(i, j) = i * 10 + j
                Next j
              Next i
              
              ActiveSheet.Range(Cells(3, 1), Cells(6, 4)) = svalues

              In fact, I did two passes reading the sequential file to load the array- one just to get the record count and to dimension the array, and the second one to populate it.

              hth

            • #545695

              Geoff

              Thank you. That

            • #545825

              Fred,

              Sorry, Idon’t know the answer.

              I was handling an array of perhaps 2,000 by 15, copying into Excel only once at the end, and it was working OK..

              I’d suggest “suck it and see”.

              Perhaps the best approach might be to load arrays of say, 50000 elements, and each time that array filled- and obviously at the end of the process- copy it into the spreadsheet.

              But that’s more work. If it works OK to copy one whole line at a time, do that. If it works to copy in 8 arrays only at the end of processing, try that. And only if there’s problems for memory or perfornance- try the hybrid approach.

              Good luck- I’d be interested to know how you go.

            • #545993

              Geoff

              I re-coded in an Array(8), vice 8 individual variable and found a slight (minimal) speed improvement. Not much of one, but enough to give me hope.

              I was wrong about not having any trouble with the array. It took a fair amount of tinkering to get the darn thing to do what I wanted! brickwall

              I was all set to re-code again, this time using 8 Array(8, x) that I would redim as x increased, to then create each of the 8 sheets from their respective arrays all at once… but someone else pointed me towards using Excels database objects instead.

              I recorded a macro of Excel doing the “Get External Data” text import (the results are not pretty, but it is VERY fast) and discovered that it uses “QueryTables.Add”. I think I’ll go bark up that tree for a while… and see if it gets me anywhere. confused

              Thanks for your help… and be prepared… I may very well end up back using the large multi-dimensional array approach… in which case I will probably be asking for help again! devil

              Either way, I will keep you posted.

            • #546033

              Fred,

              Sounds like a good idea. Something I hadn’t thought of, for sure.

              But it’s too late for me- my Excel solution was an interim one (even though ot lasted for 12 months), and has now been replaced.

              I’m glad we could help each other (to some extent!)

    Viewing 0 reply threads
    Reply To: SUMIF using 2 conditions (Excel 97)

    You can use BBCodes to format your content.
    Your account can't use all available BBCodes, they will be stripped before saving.

    Your information: