Overview

Data processing is one of the core steps of data visualization. This section introduces how PivotTable organizes and processes data so that the data both supports efficient rendering and provides PivotTable's data analysis capabilities.

Automatic Organization of Dimension Tree

Background of the Requirement

Using our diagram:

Suppose we want to implement a multidimensional table like this. Generally speaking, the parameters we would expect from the business side are:

Well-organized dimension trees RowTree and ColumnTree (similar to timeTree and channelTree)
Specific data records under the various dimensions and indicators

In theory this works, but the drawback is obvious: the business side has to assemble the data into this structure themselves, which makes integration expensive. We would rather have the business side pass only concise Records plus some simple configuration, and have the table parse the data and render it as a multidimensional table. For example, the Records passed in are the raw rows queried from the database:

const records = [
    {
        channel: "online",
        platform: "Taobao",
        shop: "Taobao flagship store",
        month: 3,
        day: 2,
        curr_price: 3999
    },
    {
        channel: "online",
        platform: "JD",
        shop: "JD third-party store",
        month: 3,
        day: 3,
        origin_price: 4399,
    },
    ...
]

Objective: transform the original dataset `Records` through data processing into a data structure that supports display in pivot table format.

Implementation Approach

Analysis

With the above background and objective, two questions naturally arise:

How do we generate rowTree and columnTree from raw data?
Answer: group aggregation. Similar to SQL's GROUP BY, we can collect the values of each dimension from records by grouping (e.g., grouping on platform yields the dimension values "Taobao" | "JD" | "Douyin").
How do we keep the time complexity as low as possible when records is large?
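
The grouping idea above can be sketched as a single pass that keeps one Set of members per dimension. This is only an illustration under simple assumptions: collectDimensionMembers and the sample data are hypothetical and are not part of VTable.

```typescript
type RawRecord = Record<string, string | number>;

// Collect the distinct members of each dimension in one pass over the records.
// Set insertion is O(1) on average, so the pass costs O(records x dimensions).
function collectDimensionMembers(
  records: RawRecord[],
  dimensionKeys: string[]
): Map<string, Set<string | number>> {
  const members = new Map<string, Set<string | number>>();
  for (const key of dimensionKeys) {
    members.set(key, new Set<string | number>());
  }
  for (const record of records) {
    for (const key of dimensionKeys) {
      if (key in record) {
        members.get(key)!.add(record[key]);
      }
    }
  }
  return members;
}

// Hypothetical sample data, modeled on the records shown earlier.
const sampleRecords: RawRecord[] = [
  { channel: "online", platform: "Taobao", month: 3, day: 2, curr_price: 3999 },
  { channel: "online", platform: "JD", month: 3, day: 3, origin_price: 4399 },
  { channel: "online", platform: "Douyin", month: 3, day: 3, origin_price: 4199 },
  { channel: "online", platform: "Taobao", month: 3, day: 4, curr_price: 4099 }
];

const dimensionMembers = collectDimensionMembers(sampleRecords, ["channel", "platform"]);
const platformMembers = Array.from(dimensionMembers.get("platform")!);
console.log(platformMembers); // ["Taobao", "JD", "Douyin"]
```

A Set preserves insertion order, so members come out in the order they first appear in the records.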

Approach

Convention for user-provided data & data structure

const datasetOptions = {
  // Raw data
  records: [
    {
      channel: 'online',
      platform: 'Taobao',
      shop: 'Taobao flagship store',
      month: 3,
      week: 1,
      day: 3,
      origin_price: 4399,
    },
    {
      channel: 'online',
      platform: 'Taobao',
      shop: 'Taobao third-party store',
      month: 3,
      week: 1,
      day: 4,
      curr_price: 4099
    },
    ...
  ],
  // Keys in the raw data for each dimension level of rowTree and columnTree
  columns: ['channel', 'platform', 'shop'],
  rows: ['month', 'week', 'day'],
  // Keys of the indicators in the raw data
  indicatorKeys: ['origin_price', 'curr_price']
};

Task
Collect dimension members (e.g., under the platform dimension there are "Taobao" | "JD" | "Douyin" three members)
Assemble rowTree, columnTree
When rendering, quickly look up the data for a cell from records (as shown in the figure)

Theoretically, based on the known tasks:

Traversing the records once accomplishes the task of "collecting dimension members". From the collected members and the columns, rows, and indicatorKeys passed in by the user, the rowTree and columnTree can then be assembled.
But how do we know the parent-child relationship between dimensions? How do we know that the shop dimension is actually a sub-dimension of the platform dimension?
The convention is that, when users pass columns, a parent dimension must be listed before its child dimensions, e.g.:

// ❌ bad 
const options = {
  columns: ['channel', 'shop', 'platform'],
  ...
};

// ✅ good
const options = {
  columns: ['channel', 'platform', 'shop'],
  ...
};

But "quickly finding the corresponding data from records when rendering" is harder. Suppose we know the row dimension and column dimension of a cell and need to implement the getCellValue(col: number, row: number) function. Iterating over records again for every lookup would be far too slow.
The most efficient method is a **hash map**, which reduces the average lookup time to O(1). So how should the hash map be structured?

In fact, the data area is a two-dimensional matrix, so (row, col) locates each cell. A two-level hash map with roughly the following structure can therefore be used to look up cell data.

// The first-level key of HashMap is row; the second-level key is col
type HashMap = Record<string, Record<string, IndicatorValue[]>>

// Indicator value
type IndicatorValue = {
    indicatorKey: string;
    value: string;
}

In our requirement, how do we define the structure of a hash map with two layers of keys? To ensure uniqueness, we can use the string composed of the path from root to leaf node in rowTree and columnTree as the key (as shown in the diagram and code below).

// Indicator value
type IndicatorValue = {
    indicatorKey: string;
    value: string;
}

// The first-level key of TreeMap is the row path; the second-level key is the column path
type TreeMap = Record<string, Record<string, IndicatorValue[]>>

// Dimension tree object
const tree: TreeMap = {
    '3-1-3': {
        'online-Taobao-Taobao flagship store': [
            {
                indicatorKey: "origin_price",
                value: '4299'
            },
            {
                indicatorKey: "curr_price",
                value: '3999'
            }
        ]
    }
}
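
To make the lookup concrete, here is a minimal sketch of building such a path-keyed map in one pass and then reading a cell without scanning the records again. The helper names (pathKey, buildTreeMap, lookupCell) and the sample records are hypothetical; the sketch uses '-' as the separator purely to match the example above and assumes no dimension member contains it.

```typescript
type RawRecord = Record<string, string | number>;
type IndicatorValue = { indicatorKey: string; value: string };
type TreeMap = Record<string, Record<string, IndicatorValue[]>>;

const SEPARATOR = "-"; // assumption: no dimension member contains "-"

// Join the record's values for the given dimension keys into one path string.
function pathKey(record: RawRecord, dimensionKeys: string[]): string {
  return dimensionKeys.map(key => String(record[key])).join(SEPARATOR);
}

// One pass over the records: every record lands in the cell addressed by
// its row path (first-level key) and column path (second-level key).
function buildTreeMap(
  records: RawRecord[],
  rows: string[],
  columns: string[],
  indicatorKeys: string[]
): TreeMap {
  const tree: TreeMap = {};
  for (const record of records) {
    const rowPath = pathKey(record, rows);
    const colPath = pathKey(record, columns);
    if (!tree[rowPath]) {
      tree[rowPath] = {};
    }
    if (!tree[rowPath][colPath]) {
      tree[rowPath][colPath] = [];
    }
    for (const indicatorKey of indicatorKeys) {
      if (indicatorKey in record) {
        tree[rowPath][colPath].push({ indicatorKey, value: String(record[indicatorKey]) });
      }
    }
  }
  return tree;
}

// Two property reads on the map; no scan of the records.
function lookupCell(tree: TreeMap, rowPath: string[], colPath: string[]): IndicatorValue[] {
  return tree[rowPath.join(SEPARATOR)]?.[colPath.join(SEPARATOR)] ?? [];
}

const demoRecords: RawRecord[] = [
  { channel: "online", platform: "Taobao", shop: "Taobao flagship store", month: 3, week: 1, day: 3, origin_price: 4299 },
  { channel: "online", platform: "Taobao", shop: "Taobao flagship store", month: 3, week: 1, day: 3, curr_price: 3999 }
];
const demoTree = buildTreeMap(
  demoRecords,
  ["month", "week", "day"],
  ["channel", "platform", "shop"],
  ["origin_price", "curr_price"]
);
const demoCell = lookupCell(demoTree, ["3", "1", "3"], ["online", "Taobao", "Taobao flagship store"]);
const missingCell = lookupCell(demoTree, ["3", "1", "9"], ["online", "JD", "JD third-party store"]);
```

Building the map costs one traversal of the records; after that, each cell read is independent of the number of records.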

Data Parsing Process

With the above analysis, let's go through the data parsing process.

Traverse

Variables to maintain during data traversal:

// Array of column dimension member paths
colKeys: string[][]
// Array of row dimension member paths
rowKeys: string[][]

During the process of traversing the data, a dimension tree object (also referred to as a hash map in the previous text; for consistency, it will be referred to as a "dimension tree object" below) will also be generated.

// Dimension tree object
// The first-level key is an element of rowKeys, joined into a string
// The second-level key is an element of colKeys, joined into a string
tree: Record<string, Record<string, IndicatorValue[]>> = {
    '3-1-3': {
        'online-Taobao-Taobao flagship store': [
            {
                indicatorKey: "origin_price",
                value: '4299'
            },
            {
                indicatorKey: "curr_price",
                value: '3999'
            }
        ]
    }
}

Search

With the dimension tree object, the data for a cell can be looked up directly during rendering, without scanning records again.

Source Code

According to the above analysis process, let's take a look at how the source code is implemented.

Code entry: `packages/vtable/src/dataset/dataset.ts`. The following code has been simplified.

setRecords: Entry method for data processing
processRecords: Process data, iterate through all entries
processRecord: Process a single record; most of the analysis above is implemented in this function

Traverse

In the earlier sketch, the dimension path was written as 'online-Taobao-Taobao flagship store'. That is problematic, because dimension members may themselves contain the '-' character.
In the source code, String.fromCharCode(0), i.e. \u0000, is used as the separator for the dimension path. In JavaScript, \u0000 is the character with code point U+0000, the null character. It is used in some contexts to mark the end of a string or as a placeholder, and it is typically not displayed when rendered. Here it mainly serves to keep the dimension path strings unique.
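
The collision described above is easy to reproduce. The snippet below is a small illustration with made-up member names; it assumes that members never contain U+0000 themselves.

```typescript
const NUL = String.fromCharCode(0); // same separator as stringJoinChar in the source

// Two different dimension paths (hypothetical members):
const twoLevelPath = ["online", "Taobao-flagship"];
const threeLevelPath = ["online", "Taobao", "flagship"];

// Joined with "-", both paths collapse to "online-Taobao-flagship".
const dashKeysCollide = twoLevelPath.join("-") === threeLevelPath.join("-"); // true

// Joined with U+0000, the keys stay distinct.
const nulKeysCollide = twoLevelPath.join(NUL) === threeLevelPath.join(NUL); // false

// The flat key can also be split back into the original path.
const restoredPath = twoLevelPath.join(NUL).split(NUL);
```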

class Dataset {
    colKeys: string[][] = [];
    rowKeys: string[][] = [];
    private colFlatKeys: Record<string, number> = {}; // Marks which colKey paths have already been added to colKeys
    private rowFlatKeys: Record<string, number> = {}; // Marks which rowKey paths have already been added to rowKeys
    tree: Record<string, Record<string, Aggregator[]>> = {};
    
    stringJoinChar = String.fromCharCode(0); // Separator for dimension paths
    
    setRecords(records: any[] | Record<string, any[]>) {
        this.processRecords();
        
        ...
    }
    
    // Process the data: iterate over all entries
    private processRecords() {
        ...
        for (let i = 0, len = this.records.length; i < len; i++) {
            const record = this.records[i];
            
            ...
            this.processRecord(record);
        }
    }
    
    // Process a single record
    private processRecord(record: any, assignedIndicatorKey?: string) {
        ...
      
        const colKeys: { colKey: string[]; indicatorKey: string | number }[] = [];
        const rowKeys: { rowKey: string[]; indicatorKey: string | number }[] = [];
        
        // Collect dimension members
        const rowKey: string[] = [];
        rowKeys.push({ rowKey, indicatorKey: assignedIndicatorKey });
        for (let l = 0, len1 = this.rows.length; l < len1; l++) {
            const rowAttr = this.rows[l];
            if (rowAttr in record) {
                this.rowsHasValue[l] = true;
                rowKey.push(record[rowAttr]);
            }
        }
        
        const colKey: string[] = [];
        colKeys.push({ colKey, indicatorKey: assignedIndicatorKey });
        for (let n = 0, len2 = this.columns.length; n < len2; n++) {
            const colAttr = this.columns[n];
            if (colAttr in record) {
                this.columnsHasValue[n] = true;
                colKey.push(record[colAttr]);
            }
        }
        
        for (let row_i = 0; row_i < rowKeys.length; row_i++) {
            const rowKey = rowKeys[row_i].rowKey;
            ...
            
            for (let col_j = 0; col_j < colKeys.length; col_j++) {
                const colKey = colKeys[col_j].colKey;
                
                // Build flatRowKey and flatColKey, which are used as the keys of the dimension tree object
                const flatRowKey = rowKey.join(this.stringJoinChar);
                const flatColKey = colKey.join(this.stringJoinChar);
                
                ...
                
                if (rowKey.length !== 0) {
                  if (!this.rowFlatKeys[flatRowKey]) {
                    this.rowKeys.push(rowKey);
                    this.rowFlatKeys[flatRowKey] = 1;
                  }
                }
                if (colKey.length !== 0) {
                  if (!this.colFlatKeys[flatColKey]) {
                    this.colKeys.push(colKey);
                    this.colFlatKeys[flatColKey] = 1;
                  }
                }
        
                if (!this.tree[flatRowKey]) {
                  this.tree[flatRowKey] = {};
                }
                
                // Build the dimension tree object
                if (!this.tree[flatRowKey]?.[flatColKey]) {
                  this.tree[flatRowKey][flatColKey] = [];
                }
                
                const toComputeIndicatorKeys = this.indicatorKeysIncludeCalculatedFieldDependIndicatorKeys;
                for (let i = 0; i < toComputeIndicatorKeys.length; i++) {
                    let needAddToAggregator = false;
                    
                    ...
                    
                    // Push the record into the aggregator stored in the dimension tree object
                    if (needAddToAggregator) {
                        this.tree[flatRowKey]?.[flatColKey]?.[i].push(record);
                    }
                }
                ...
                
            }
        }
        
    }
}

Assemble

ArrToTree and ArrToTree1: Convert rowKeys and colKeys to a tree structure

private ArrToTree1(
    arr: string[][],
    rows: string[],
    indicators: (string | IIndicator)[] | undefined,
    ...
  ): {
    value: string;
    dimensionKey: string;
    children: any[] | undefined;
  }[] {
    const result: any[] = []; // Result
    const concatStr = this.stringJoinChar; // Join string (any value works as long as keys stay unique)
    const map = new Map(); // Stores nodes by path, mainly to improve performance
    
    function addList(list: any, isGrandTotal: boolean) {
      const path: any[] = []; // Path from the root
      let node: any; // Current node
      list.forEach((value: any, index: number) => {
        path.push(value);
        const flatKey = path.join(concatStr);
        // The id could be generated fresh each time; the path is used as the id so the layout object can look it up easily
        let item: { value: string; dimensionKey: string; children: any[] | undefined } = map.get(flatKey); // Current node
        if (!item) {
          item = {
            value,
            dimensionKey: rows[index],
            // Leaf nodes of the tree get the indicators as children
            children:
              index === list.length - 1 && (indicators?.length ?? 0) >= 1
                ? indicators?.map(indicator => {
                    if (typeof indicator === 'string') {
                      return {
                        indicatorKey: indicator,
                        value: indicator
                      };
                    }
                    return {
                      indicatorKey: indicator.indicatorKey,
                      value: indicator.title
                    };
                  })
                : []
          };

          map.set(flatKey, item); // Store the node for this path
          if (node) {
            node.children.push(item);
          } else {
            if (showGrandTotalsOnTop && isGrandTotal) {
              result.unshift(item);
            } else {
              result.push(item);
            }
          }
        }
        node = item; // Move to the current node
      });
    }

    arr.forEach(item => addList(item, false));
    ...
    
    return result;
}
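
Stripped of indicators and grand totals, the core of this conversion can be sketched as follows. This is a simplified illustration, not the library's code: pathsToTree, TreeNode, and the sample paths are hypothetical names.

```typescript
interface TreeNode {
  value: string;
  dimensionKey: string;
  children: TreeNode[];
}

const PATH_SEPARATOR = String.fromCharCode(0);

// Convert member paths such as ["online", "Taobao", "Taobao flagship store"]
// into a tree. The Map from flattened path prefix to node means each prefix
// is created only once and found again in O(1).
function pathsToTree(paths: string[][], dimensionKeys: string[]): TreeNode[] {
  const roots: TreeNode[] = [];
  const nodeByPath = new Map<string, TreeNode>();

  for (const path of paths) {
    let parent: TreeNode | undefined;
    const prefix: string[] = [];
    for (let depth = 0; depth < path.length; depth++) {
      prefix.push(path[depth]);
      const flatKey = prefix.join(PATH_SEPARATOR);
      let node = nodeByPath.get(flatKey);
      if (!node) {
        node = { value: path[depth], dimensionKey: dimensionKeys[depth], children: [] };
        nodeByPath.set(flatKey, node);
        if (parent) {
          parent.children.push(node);
        } else {
          roots.push(node);
        }
      }
      parent = node; // descend one level
    }
  }
  return roots;
}

const columnTree = pathsToTree(
  [
    ["online", "Taobao", "Taobao flagship store"],
    ["online", "Taobao", "Taobao third-party store"],
    ["online", "JD", "JD third-party store"]
  ],
  ["channel", "platform", "shop"]
);
```

Here columnTree has a single root "online" with the children "Taobao" (two shops) and "JD" (one shop).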

Search

PivotTable can obtain cell values from the dataset module through methods like pivotTable.getCellValue. These methods will eventually call the dataset.getAggregator method.

As the code shows, the aggregator is read directly from the dimension tree object via flatRowKey + flatColKey + indicatorIndex. Each step is a hash or array lookup, so the time complexity can be regarded as roughly O(1).

getAggregator(
    rowKey: string[] | string = [],
    colKey: string[] | string = [],
    indicator: string,
    considerChangedValue: boolean = true,
    indicatorPosition?: { position: 'col' | 'row'; index?: number }
  ): IAggregator {
    const indicatorIndex = this.indicatorKeys.indexOf(indicator);

    ...
    
    const agg = this.tree[flatRowKey]?.[flatColKey]?.[indicatorIndex];
    
    return agg
}

Data Update Status

Add: in a tree display scenario, dynamically inserting child node data goes through the setTreeNodeChildren interface, which calls the addRecords interface, which in turn triggers processRecord.
Change: in a table editing scenario, cell values may be updated, in which case pivotTable.changeCellValues and pivotTable.changeCellValue are called to change the cell data.
In addition to triggering recalculation of widths and heights, these methods ultimately call the dataset.changeRecordFieldValue method (shown below) during data processing. The records are updated first; then this.processRecords() traverses the records again and regenerates the dimension tree object.

changeRecordFieldValue(fieldName: string, oldValue: string | number, value: string | number) {
  ...
  
  for (let i = 0, len = this.records.length; i < len; i++) {
      const record = this.records[i];
      if (record[fieldName] === oldValue) {
        record[fieldName] = value;
      }
  }
  
  this.rowFlatKeys = {};
  this.colFlatKeys = {};
  this.tree = {};
  this.processRecords();
}
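
One way to see why a full rebuild is the simplest safe choice: changing a dimension value can change the path keys themselves, so rows may merge or split. The sketch below is a simplified illustration with hypothetical helpers (collectRowPaths, changeFieldValue); it tracks only row paths, not aggregators.

```typescript
type RawRecord = Record<string, string | number>;
const SEP = String.fromCharCode(0);

// Collect the distinct flattened row paths, in first-seen order.
function collectRowPaths(records: RawRecord[], rows: string[]): string[] {
  const seen: Record<string, boolean> = {};
  const paths: string[] = [];
  for (const record of records) {
    const flatKey = rows.map(key => String(record[key])).join(SEP);
    if (!seen[flatKey]) {
      seen[flatKey] = true;
      paths.push(flatKey);
    }
  }
  return paths;
}

// Update the records in place, then rebuild the paths from scratch.
function changeFieldValue(
  records: RawRecord[],
  fieldName: string,
  oldValue: string | number,
  newValue: string | number,
  rows: string[]
): string[] {
  for (const record of records) {
    if (record[fieldName] === oldValue) {
      record[fieldName] = newValue;
    }
  }
  return collectRowPaths(records, rows);
}

const editableRecords: RawRecord[] = [
  { month: 3, week: 1, day: 2, curr_price: 3999 },
  { month: 3, week: 1, day: 3, origin_price: 4399 }
];
const rowDims = ["month", "week", "day"];
const pathsBefore = collectRowPaths(editableRecords, rowDims); // two distinct rows
const pathsAfter = changeFieldValue(editableRecords, "day", 2, 3, rowDims); // rows merge into one
```

The trade-off is that every edit costs a full traversal of the records, which is the behavior the source code above shows.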

Data Analysis

Background of the Requirement

One of the core functions of multidimensional tables is data analysis, which can help users analyze various scenario indicators and comparisons, aiding business analysis to drive decision-making. The following are the data analysis capabilities of PivotTable.

Requirements Analysis

In the previous section "Automatic Organization of Dimension Tree", we imagined the dimension tree object value as the IndicatorValue[] type. To implement the functions of aggregation and calculated fields, the data structure of the dimension tree object value needs to be redesigned. What should the data structure be? How can we perform statistics on these aggregated data while traversing **Records**? This is actually one of the core designs of this section.

// The dimension tree object as we envisioned it earlier
const tree = {
    '3-1-3': {
        'online-Taobao-Taobao flagship store': [
            {
                indicatorKey: "origin_price",
                value: '4299'
            },
            {
                indicatorKey: "curr_price",
                value: '3999'
            }
        ]
    }
}

The filtering and derived field functions can be implemented before traversing Records. According to the agreed filtering rules, unnecessary data is removed from Records first, without affecting subsequent calculations.

What is the difference between calculated fields and derived fields? Both are derived from the original data.
Derived fields: **dimensions** derived from the original data. E.g., given a date dimension field with the value "2025-02-03", derive the dimensions year, month, week, and day.
Calculated fields: **metrics** derived from the original data. E.g., given the metrics "original price" and "actual price", derive the metric "discount rate".
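
The distinction can be illustrated with two small functions. This is a hypothetical sketch: deriveDateParts, discountRate, and the field names are assumptions made for the example, not VTable APIs.

```typescript
type RawRecord = Record<string, string | number>;

// Derived field: new DIMENSIONS computed per record, before aggregation.
// (This sketch returns a copy; it only handles "YYYY-MM-DD" strings.)
function deriveDateParts(record: RawRecord): RawRecord {
  const [year, month, day] = String(record.date).split("-").map(Number);
  return { ...record, year, month, day };
}

// Calculated field: a new METRIC computed from already aggregated metrics.
function discountRate(aggregated: { origin_price: number; curr_price: number }): number {
  return 1 - aggregated.curr_price / aggregated.origin_price;
}

const derivedRecord = deriveDateParts({ date: "2025-02-03", origin_price: 4000, curr_price: 3000 });
// derivedRecord.year === 2025, derivedRecord.month === 2, derivedRecord.day === 3

const demoDiscount = discountRate({ origin_price: 4000, curr_price: 3000 });
// demoDiscount === 0.25
```

A derived field can run before traversal because it needs only one record; a calculated field has to wait until the metrics it depends on have been aggregated.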

Summarization is a commonly used feature in multidimensional tables. It can only be implemented **after traversing** `Records`, because the aggregated values and calculated fields must exist before they can be summarized.

According to the above analysis, in order to achieve data analysis functionality, the data parsing process may change as follows:

Source Code & Implementation

With the above analysis and questions, let's take a look at how the source code is implemented.

Code entry: `packages/vtable/src/dataset/dataset.ts`. The following code has been simplified.

This logic is mainly distributed across the setRecords, processRecords, and processRecord methods.

setRecords: Entry method for data processing
processRecords: Process data, iterate through all entries
processRecord: Process a single data entry

Filter

The source code is as follows and is fairly self-explanatory.

export class Dataset {
    // Filter rules
    filterRules?: FilterRules;
    
    // Detail records
    records?: any[] | Record<string, any[]>;
    filteredRecords?: any[] | Record<string, any[]>;
    
    // Process the data: iterate over all entries
    private processRecords() {
        let isNeedFilter = false;
        if ((this.filterRules?.length ?? 0) >= 1) {
          isNeedFilter = true;
        }
        
        for (let i = 0, len = this.records.length; i < len; i++) {
            const record = this.records[i];
            // If this.filterRecord(record) returns false, the record is filtered out and skips the rest of the data processing
            if (!isNeedFilter || this.filterRecord(record)) {
                (this.filteredRecords as any[]).push(record);
                this.processRecord(record);
            }
          }
    }
    
    // Iterate over the filter rules; the record is dropped as soon as one rule rejects it
    private filterRecord(record: any): boolean {
        let isReserved = true;
        if (this.filterRules) {
            for (let i = 0; i < this.filterRules.length; i++) {
                const filterRule = this.filterRules[i];
                if (filterRule.filterKey) {
                    const filterValue = record[filterRule.filterKey];
                    if (filterRule.filteredValues?.indexOf(filterValue) === -1) {
                        isReserved = false;
                        break;
                    }
                } else if (!filterRule.filterFunc?.(record)) {
                    isReserved = false;
                    break;
                }
            }
        }
        return isReserved;
    }
}
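
Note that in filterRecord a record is kept only if its value appears in filteredValues, so filteredValues acts as the list of values to keep. The standalone sketch below mirrors that logic; the FilterRule shape is reconstructed from the fields used above, and keepRecord and the sample data are hypothetical.

```typescript
type RawRecord = Record<string, string | number>;

// Local reconstruction of the rule shape used by filterRecord above.
interface FilterRule {
  filterKey?: string;
  filteredValues?: (string | number)[];
  filterFunc?: (record: RawRecord) => boolean;
}

// A record is kept only if every rule accepts it.
function keepRecord(record: RawRecord, rules: FilterRule[]): boolean {
  return rules.every(rule => {
    if (rule.filterKey) {
      // Kept unless filteredValues exists and does not contain the value.
      return rule.filteredValues?.indexOf(record[rule.filterKey]) !== -1;
    }
    // Without a filterKey the rule relies on filterFunc; a missing function rejects.
    return Boolean(rule.filterFunc?.(record));
  });
}

const filterRules: FilterRule[] = [
  { filterKey: "platform", filteredValues: ["Taobao"] },
  { filterFunc: record => Number(record.month) === 3 }
];

const candidateRecords: RawRecord[] = [
  { platform: "Taobao", month: 3, curr_price: 3999 },
  { platform: "JD", month: 3, origin_price: 4399 },
  { platform: "Taobao", month: 4, curr_price: 4099 }
];

const keptRecords = candidateRecords.filter(record => keepRecord(record, filterRules));
// Only the first record passes both rules.
```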

Derived Fields

export class Dataset {
    // Derived field rules
    derivedFieldRules?: DerivedFieldRules;
    
    // Process a single record
    private processRecord(record: any, assignedIndicatorKey?: string) {
        this.derivedFieldRules?.forEach((derivedFieldRule: DerivedFieldRule, i: number) => {
            if (derivedFieldRule.fieldName && derivedFieldRule.derivedFunc) {
                // Compute the field value from the rule's fieldName and function, and write it into the record
                record[derivedFieldRule.fieldName] = derivedFieldRule.derivedFunc(record);
            }
        });
    
    }
}

Aggregation

Aggregator Class

Each AggregationType has its own aggregation class, implemented on top of the Aggregator base class.

export enum AggregationType {
  RECORD = 'RECORD',
  NONE = 'NONE', // No aggregation: take one record as the node's record and read its field
  SUM = 'SUM',
  MIN = 'MIN',
  MAX = 'MAX',
  AVG = 'AVG',
  COUNT = 'COUNT',
  CUSTOM = 'CUSTOM',
  RECALCULATE = 'RECALCULATE' // Calculated field
}

// packages/vtable/src/ts-types/dataset/aggregation.ts
export interface IAggregator {
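  records: any[];  // Records behind the aggregated value, kept so the value can be traced later
  value: () => any; // Get the aggregated value
  push: (record: any) => void; // Add a record to the aggregator for computing the aggregated value
  deleteRecord: (record: any) => void; // Remove a record from the aggregator and update the aggregated value, e.g. called by VTable's deleteRecords interface
  updateRecord: (oldRecord: any, newRecord: any) => void; // Update a record and the aggregated value, e.g. called by the updateRecords interface
  recalculate: () => any; // Recompute the aggregated value, e.g. currently called when cell values are copied and pasted
  formatValue?: (col?: number, row?: number, table?: BaseTableAPI) => any; // Formatted aggregated value
  formatFun?: () => any; // Format function
  clearCacheValue: () => any; // Clear the cached value
  reset: () => void; // Reset the aggregator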
}

export abstract class Aggregator implements IAggregator {
  isAggregator?: boolean = true;
  isRecord?: boolean = true; // Whether to maintain records, i.e. keep all source records
  records: any[] = [];
  type?: string;
  key: string;
  field?: string | string[];
  formatFun?: any;
  _formatedValue?: any;

  constructor(config: { key: string; field: string | string[]; formatFun?: any; isRecord?: boolean }) {
    this.key = config.key;
    this.field = config.field;
    this.formatFun = config.formatFun;
    this.isRecord = config.isRecord ?? this.isRecord;
  }
  abstract push(record: any): void;
  abstract deleteRecord(record: any): void;
  abstract updateRecord(oldRecord: any, newRecord: any): void;
  abstract value(): any;
  abstract recalculate(): any;
  clearCacheValue() {
    this._formatedValue = undefined;
  }
  formatValue(col?: number, row?: number, table?: BaseTableAPI) {
     ...
  }
  reset() {
    this.records = [];
    this.clearCacheValue();
  }
}

// Concrete aggregation classes built on Aggregator
export class SumAggregator extends Aggregator {
    ...
}
export class CountAggregator extends Aggregator {
    ...
}
...

The values of the dimension tree object are actually of type Aggregator[]:

tree: Record<string, Record<string, Aggregator[]>> = {};

We choose AggregationType.SUM and AggregationType.RECALCULATE (calculated fields) to specifically analyze the implementation process.

SumAggregator

The general process is as follows:

// packages/vtable/src/ts-types/dataset/aggregation.ts
export const registeredAggregators: {
  [key: string]: {
    new (args: {
      key?: string;
      field: string | string[];
      aggregationFun?: any;
      formatFun?: any;
      isRecord?: boolean;
      needSplitPositiveAndNegative?: boolean;
      calculateFun?: any;
      dependAggregators?: any;
      dependIndicatorKeys?: string[];
    }): Aggregator;
  };
} = {};

// packages/vtable/src/dataset/dataset.ts
export class Dataset {
    // Aggregation rules
    aggregationRules?: AggregationRules;
    
    // Register an aggregation type in the registeredAggregators object for later use
    registerAggregator(type: string, aggregator: any) {
        registeredAggregators[type] = aggregator;
    }
    
    // Register the built-in aggregation types; runs at the very start of the constructor
    registerAggregators() {
        this.registerAggregator(AggregationType.RECORD, RecordAggregator);
        this.registerAggregator(AggregationType.SUM, SumAggregator);
        this.registerAggregator(AggregationType.COUNT, CountAggregator);
        this.registerAggregator(AggregationType.MAX, MaxAggregator);
        this.registerAggregator(AggregationType.MIN, MinAggregator);
        this.registerAggregator(AggregationType.AVG, AvgAggregator);
        this.registerAggregator(AggregationType.NONE, NoneAggregator);
        this.registerAggregator(AggregationType.RECALCULATE, RecalculateAggregator);
        this.registerAggregator(AggregationType.CUSTOM, CustomAggregator);
    }
    
     
    // Process a single record
    private processRecord(record: any, assignedIndicatorKey?: string) {
        ...
        const toComputeIndicatorKeys = this.indicatorKeysIncludeCalculatedFieldDependIndicatorKeys;
        for (let i = 0; i < toComputeIndicatorKeys.length; i++) {
            
            ...
            // aggRule may be undefined, depending on the indicator field
            const aggRule = this.getAggregatorRule(toComputeIndicatorKeys[i]);
            let needAddToAggregator = false;
            
            ...
            
            // If this indicatorKey exists in the record, the logic below runs
            toComputeIndicatorKeys[i] in record && (needAddToAggregator = true);
            if (!this.tree[flatRowKey]?.[flatColKey]?.[i] && needAddToAggregator) {
                // If no aggRule is specified for this indicatorKey, default to AggregationType.SUM
                // Add an Aggregator instance to the value of the dimension tree object
                this.tree[flatRowKey][flatColKey][i] = new registeredAggregators[
                    aggRule?.aggregationType ?? AggregationType.SUM
                ]({
                    key: toComputeIndicatorKeys[i],
                    field: aggRule?.field ?? toComputeIndicatorKeys[i],
                    aggregationFun: aggRule?.aggregationFun,
                    formatFun:
                        aggRule?.formatFun ??
                        (
                          this.indicators?.find((indicator: string | IIndicator) => {
                              if (typeof indicator !== 'string') {
                                  return indicator.indicatorKey === toComputeIndicatorKeys[i];
                              }
                              return false;
                          }) as IIndicator
                        )?.format
              });
            }
            
            if (needAddToAggregator) {
                // Then call the Aggregator instance's push method, which stores the raw record in Aggregator.records
                this.tree[flatRowKey]?.[flatColKey]?.[i].push(record);
            }
        }
    }
}
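
The registry pattern used here, including the fallback to SUM when no rule is given, can be sketched in isolation. All names below (SimpleAggregator, aggregatorRegistry, createAggregator, SumAgg, CountAgg) are hypothetical and heavily simplified compared with the real classes.

```typescript
interface SimpleAggregator {
  push(record: Record<string, unknown>): void;
  value(): number | undefined;
}
type AggregatorCtor = new (field: string) => SimpleAggregator;

// Registry from aggregation type name to constructor.
const aggregatorRegistry: Record<string, AggregatorCtor> = {};

function registerAggregator(type: string, ctor: AggregatorCtor): void {
  aggregatorRegistry[type] = ctor;
}

class SumAgg implements SimpleAggregator {
  private field: string;
  private total = 0;
  private count = 0;
  constructor(field: string) {
    this.field = field;
  }
  push(record: Record<string, unknown>): void {
    const value = Number(record[this.field]);
    if (!isNaN(value)) {
      this.total += value;
      this.count++;
    }
  }
  value(): number | undefined {
    return this.count > 0 ? this.total : undefined;
  }
}

class CountAgg implements SimpleAggregator {
  private field: string;
  private count = 0;
  constructor(field: string) {
    this.field = field;
  }
  push(record: Record<string, unknown>): void {
    if (this.field in record) {
      this.count++;
    }
  }
  value(): number {
    return this.count;
  }
}

registerAggregator("SUM", SumAgg);
registerAggregator("COUNT", CountAgg);

// Fall back to SUM when no aggregation type is configured.
function createAggregator(type: string | undefined, field: string): SimpleAggregator {
  const ctor = aggregatorRegistry[type ?? "SUM"];
  if (!ctor) {
    throw new Error("Unknown aggregation type: " + type);
  }
  return new ctor(field);
}

const defaultAgg = createAggregator(undefined, "origin_price");
const countAgg = createAggregator("COUNT", "origin_price");
for (const record of [{ origin_price: 4399 }, { origin_price: 4099 }]) {
  defaultAgg.push(record);
  countAgg.push(record);
}
```

Looking constructors up by name keeps processRecord free of per-type branching and leaves room for registering custom aggregators.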

// packages/vtable/src/ts-types/dataset/aggregation.ts
export class SumAggregator extends Aggregator {
    type: string = AggregationType.SUM;
    sum = 0;
    ...
    
    push(record: any): void {
        if (record) {
            // Keep the source records so the aggregated value can be traced later
            if (this.isRecord && this.records) {
                if (record.isAggregator) {
                    this.records.push(...record.records);
                } else {
                    this.records.push(record);
                }
            }
        
            ...
            const value = parseFloat(record[this.field]);
            this.sum += value;
            if (this.needSplitPositiveAndNegativeForSum) {
              if (value > 0) {
                this.positiveSum += value;
              } else if (value < 0) {
                this.nagetiveSum += value;
              }
            }
        }
    
        this.clearCacheValue();
    }
    
    // Get the sum
    value() {
        return this.records?.length >= 1 ? this.sum : undefined;
    }
  
  ...
}
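
As a self-contained illustration of the push/value contract, including the positive/negative split, consider the sketch below. SplitSumAggregator is a hypothetical class; skipping non-numeric values is an assumption of this sketch, since the corresponding source lines are elided above.

```typescript
type Row = Record<string, string | number>;

class SplitSumAggregator {
  records: Row[] = []; // source records, kept for traceability
  sum = 0;
  positiveSum = 0;
  negativeSum = 0;
  private field: string;

  constructor(field: string) {
    this.field = field;
  }

  push(record: Row): void {
    this.records.push(record);
    const value = parseFloat(String(record[this.field]));
    if (isNaN(value)) {
      return; // assumption: non-numeric values do not contribute to the sum
    }
    this.sum += value;
    if (value > 0) {
      this.positiveSum += value;
    } else if (value < 0) {
      this.negativeSum += value;
    }
  }

  // undefined (rather than 0) when nothing has been pushed, so an empty cell
  // can be told apart from a cell whose values sum to zero
  value(): number | undefined {
    return this.records.length >= 1 ? this.sum : undefined;
  }
}

const emptyProfit = new SplitSumAggregator("profit");
const profit = new SplitSumAggregator("profit");
profit.push({ profit: 120 });
profit.push({ profit: "-20" });
profit.push({ profit: "n/a" });
// profit.value() is 100, with positiveSum 120 and negativeSum -20
```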

Calculated Fields

The Aggregator class for calculated fields is RecalculateAggregator. Its processing flow is very similar to that of SumAggregator.

export class Dataset {
    // Calculated field rules
    calculatedFieldRules?: CalculateddFieldRules;
    /** Calculated field keys */
    calculatedFiledKeys?: string[];
    calculatedFieldDependIndicatorKeys?: string[];
    
    // Process a single record
    private processRecord(record: any, assignedIndicatorKey?: string) {
        ...
        const toComputeIndicatorKeys = this.indicatorKeysIncludeCalculatedFieldDependIndicatorKeys;
        for (let i = 0; i < toComputeIndicatorKeys.length; i++) {
            // Handle the keys that are calculated fields
            if (this.calculatedFiledKeys.indexOf(toComputeIndicatorKeys[i]) >= 0) {
                // Find the calculation rule for this calculated field
                const calculatedFieldRule = this.calculatedFieldRules?.find(rule => rule.key === toComputeIndicatorKeys[i]);
                
                if (!this.tree[flatRowKey]?.[flatColKey]?.[i]) {
                    // Add a new RECALCULATE Aggregator instance to the dimension tree
                    this.tree[flatRowKey][flatColKey][i] = new registeredAggregators[AggregationType.RECALCULATE]({
                        key: toComputeIndicatorKeys[i],
                        field: toComputeIndicatorKeys[i],
                        isRecord: true,
                        formatFun: (
                          this.indicators?.find((indicator: string | IIndicator) => {
                            if (typeof indicator !== 'string') {
                              return indicator.indicatorKey === toComputeIndicatorKeys[i];
                            }
                            return false;
                          }) as IIndicator
                        )?.format,
                        calculateFun: calculatedFieldRule?.calculateFun,
                        dependAggregators: this.tree[flatRowKey][flatColKey],
                        dependIndicatorKeys: calculatedFieldRule?.dependIndicatorKeys
                    });
                }
                
                // Store the raw record it depends on in the RECALCULATE Aggregator's records
                this.tree[flatRowKey]?.[flatColKey]?.[i].push(record);
            }
        }
    }
}

// packages/vtable/src/ts-types/dataset/aggregation.ts
export class RecalculateAggregator extends Aggregator {
    type: string = AggregationType.RECALCULATE;
    isRecord?: boolean = true;
    declare field?: string;
    calculateFun: Function;
    fieldValue?: any;
    dependAggregators: Aggregator[];
    dependIndicatorKeys: string[];
    
    ...
    
    push(record: any): void {
        if (record && this.isRecord && this.records) {
            if (record.isAggregator) {
                this.records.push(...record.records);
            } else {
                this.records.push(record);
            }
        }
        this.clearCacheValue();
    }
    
    // Get the value of the calculated field
    value() {
        if (!this.fieldValue) {
            // Get the values of the aggregators it depends on
            const aggregatorValue = _getDependAggregatorValues(this.dependAggregators, this.dependIndicatorKeys);
            // Then use calculateFun to compute the calculated field's value
            this.fieldValue = this.calculateFun?.(aggregatorValue, this.records, this.field);
        }
        return this.fieldValue;
    }
}
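
The key idea, computing lazily from the values of other aggregators and caching the result, can be sketched as follows. LazyCalculatedValue and ValueSource are hypothetical names, and the explicit invalidate method is an assumption of this sketch rather than the library's API.

```typescript
interface ValueSource {
  key: string;
  value(): number | undefined;
}

class LazyCalculatedValue {
  private cached: number | undefined;
  private sources: ValueSource[];
  private calculate: (values: Record<string, number | undefined>) => number;

  constructor(
    sources: ValueSource[],
    calculate: (values: Record<string, number | undefined>) => number
  ) {
    this.sources = sources;
    this.calculate = calculate;
  }

  // Drop the cached result, e.g. after a dependency has received new records.
  invalidate(): void {
    this.cached = undefined;
  }

  // Read the dependencies only when the value is first requested.
  value(): number {
    if (this.cached === undefined) {
      const values: Record<string, number | undefined> = {};
      for (const source of this.sources) {
        values[source.key] = source.value();
      }
      this.cached = this.calculate(values);
    }
    return this.cached;
  }
}

let calculateCalls = 0;
const discountField = new LazyCalculatedValue(
  [
    { key: "origin_price", value: () => 4000 },
    { key: "curr_price", value: () => 3000 }
  ],
  values => {
    calculateCalls++;
    return 1 - (values.curr_price ?? 0) / (values.origin_price ?? 1);
  }
);

const firstRead = discountField.value(); // computes 0.25
const secondRead = discountField.value(); // served from the cache
```

Because the calculation runs only when value() is called, the dependencies can keep receiving records during traversal without the calculated field being recomputed for each record.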

Summarization

This process is fairly involved.

export class Dataset {
    // Totals configuration
    totals?: Totals;
    // Global statistics of the extreme values of each indicator
    indicatorStatistics: { max: Aggregator; min: Aggregator; total: Aggregator }[] = [];
    // Caches, for each entry of rows, whether it is a total field
    private rowsIsTotal: boolean[] = [];
    private colsIsTotal: boolean[] = [];
    private colGrandTotalLabel: string;
    private colSubTotalLabel: string;
    private rowGrandTotalLabel: string;
    private rowSubTotalLabel: string;
    // Total records passed in by the user
    totalRecordsTree: Record<string, Record<string, Aggregator[]>> = {};
    
    setRecords(records: any[] | Record<string, any[]>) {
        ...
        // Compute totals: after this.processRecords() and before sorting
        this.totalStatistics();
    }
    
    
    // Totals and subtotals
    totalStatistics() {
        // If row or column has a totals configuration
        if (...) {
            const rowTotalKeys: string[] = [];
            
            // Iterate over every row path and column path in the dimension tree
            Object.keys(that.tree).forEach(flatRowKey => {
                const rowKey = flatRowKey.split(this.stringJoinChar);
                Object.keys(that.tree[flatRowKey]).forEach(flatColKey => {
                    // If row has subtotals
                    if (...) {
                        for (let i = 0, len = that.totals?.row?.subTotalsDimensions?.length; i < len; i++) {
                            // Take each row dimension that has a subtotal configuration
                            const dimension = that.totals.row.subTotalsDimensions[i];
                            const dimensionIndex = that.rows.indexOf(dimension);
                            
                            const rowTotalKey = rowKey.slice(0, dimensionIndex + 1);
                            if (this.rowHierarchyType !== 'tree') {
                                // In tree mode, do not append the subtotal cell value
                                rowTotalKey.push(that.rowSubTotalLabel);
                            }
                            const flatRowTotalKey = rowTotalKey.join(this.stringJoinChar);
                            
                            if (!this.tree[flatRowTotalKey]) {
                                this.tree[flatRowTotalKey] = {};
                                rowTotalKeys.push(flatRowTotalKey);
                            }
                            if (!this.tree[flatRowTotalKey][flatColKey]) {
                                this.tree[flatRowTotalKey][flatColKey] = [];
                            }
                            
                            // Similar to the aggregation logic described earlier:
                            // Aggregator instances are added to the dimension tree at this row/column key
                            const toComputeIndicatorKeys = this.indicatorKeysIncludeCalculatedFieldDependIndicatorKeys;
                            for (let i = 0; i < toComputeIndicatorKeys.length; i++) {
                                if (!this.tree[flatRowTotalKey][flatColKey][i]) {
                                    if (this.calculatedFiledKeys.indexOf(toComputeIndicatorKeys[i]) >= 0) {
                                    
                                        ...
                                        const aggRule = this.getAggregatorRule(toComputeIndicatorKeys[i]);
                                        this.tree[flatRowTotalKey][flatColKey][i] = new registeredAggregators[
                                            aggRule?.aggregationType ?? AggregationType.SUM
                                        ]({
                                            key: toComputeIndicatorKeys[i],
                                            field: aggRule?.field ?? toComputeIndicatorKeys[i],
                                            formatFun:
                                              aggRule?.formatFun ??
                                              (
                                                this.indicators?.find((indicator: string | IIndicator) => {
                                                  if (typeof indicator !== 'string') {
                                                    return indicator.indicatorKey === toComputeIndicatorKeys[i];
                                                  }
                                                  return false;
                                                }) as IIndicator
                                              )?.format
                                        });
                                    }
                                }
                                // The interesting part: the aggregator at flatRowTotalKey receives every aggregator under flatRowKey
                                if (flatRowTotalKey !== flatRowKey) {
                                    this.tree[flatRowTotalKey][flatColKey][i].push(that.tree[flatRowKey]?.[flatColKey]?.[i]);
                                }
                            }
                        }
                    }
                    
                    // If rows have grand totals configured, handle them in a similar way
                    if (that.totals?.row?.showGrandTotals || this.columns.length === 0) {
                        ...
                    }
                    
                    colCompute(flatRowKey, flatColKey);
                })
                
                // Traverse the newly added rowTotalKeys once more to compute subtotals of subtotals, e.g. the cell at Northeast subtotal (row) x Office Supplies subtotal (col)
                rowTotalKeys.forEach(flatRowKey => {
                    Object.keys(that.tree[flatRowKey]).forEach(flatColKey => {
                        // Compute the totals across all columns for each row
                        colCompute(flatRowKey, flatColKey);
                    })
                })
                
            })
            
            for (const flatRowKey in that.totalRecordsTree) {
                for (const flatColKey in that.totalRecordsTree[flatRowKey]) {
                    // Compute the totals across all columns for each row
                    colCompute(flatRowKey, flatColKey);
                }
            }
        }
    }
}
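
The subtotal step may be easier to follow on a stripped-down model. The sketch below is not the actual implementation: `SumAgg`, `addSubtotals`, the `'|'` join character, and the `'Subtotal'` label are all assumptions made for illustration. It keeps only the core idea of the code above: for every leaf row key, derive the subtotal row key by truncating the key at the subtotal dimension, then push the leaf aggregator into the subtotal aggregator.

```typescript
// Hypothetical, simplified stand-in for an Aggregator: it only sums.
class SumAgg {
  private sum = 0;
  // Accepts a raw number or a child SumAgg. The child's value is read
  // eagerly at push time, which is a simplification for this sketch.
  push(item: number | SumAgg): void {
    this.sum += typeof item === 'number' ? item : item.value();
  }
  value(): number {
    return this.sum;
  }
}

// flat row key -> flat column key -> aggregator (one indicator only)
type MiniTree = Record<string, Record<string, SumAgg>>;

const JOIN_CHAR = '|'; // assumed join character
const SUBTOTAL_LABEL = 'Subtotal'; // assumed subtotal label

// Adds subtotal rows for the row dimension at dimensionIndex.
// Returns the flat keys of the subtotal rows that were created.
function addSubtotals(tree: MiniTree, dimensionIndex: number): string[] {
  const added: string[] = [];
  // Object.keys takes a snapshot, so rows added below are not revisited.
  for (const flatRowKey of Object.keys(tree)) {
    const rowKey = flatRowKey.split(JOIN_CHAR);
    const rowTotalKey = rowKey.slice(0, dimensionIndex + 1);
    rowTotalKey.push(SUBTOTAL_LABEL);
    const flatRowTotalKey = rowTotalKey.join(JOIN_CHAR);
    for (const flatColKey of Object.keys(tree[flatRowKey])) {
      if (!tree[flatRowTotalKey]) {
        tree[flatRowTotalKey] = {};
        added.push(flatRowTotalKey);
      }
      if (!tree[flatRowTotalKey][flatColKey]) {
        tree[flatRowTotalKey][flatColKey] = new SumAgg();
      }
      // Roll the leaf aggregator up into the subtotal aggregator.
      tree[flatRowTotalKey][flatColKey].push(tree[flatRowKey][flatColKey]);
    }
  }
  return added;
}

// Helper for the demo: an aggregator holding a single value.
function leafAgg(n: number): SumAgg {
  const agg = new SumAgg();
  agg.push(n);
  return agg;
}

const demoTree: MiniTree = {
  'Online|Taobao': { '3': leafAgg(3999) },
  'Online|JD': { '3': leafAgg(4399) },
  'Offline|Store': { '3': leafAgg(1000) }
};
const addedSubtotalKeys = addSubtotals(demoTree, 0);
```

After the call, `demoTree` contains two extra rows, `Online|Subtotal` and `Offline|Subtotal`, whose aggregators hold the sums of the rows beneath them.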

Sorting

colKeys and rowKeys are sorted separately, each according to the configured sort rules.

export class Dataset {
    // Sort rules
    sortRules?: SortRules;
    
    colKeys: string[][] = [];
    rowKeys: string[][] = [];
    // Keep the unsorted (initial, normal-state) rowKeys and colKeys
    colKeys_normal: string[][] = [];
    rowKeys_normal: string[][] = [];
    
    setRecords(records: any[] | Record<string, any[]>) {
        this.processRecords(); // Dimension members are collected here
        
        ...
        this.rowKeys_normal = this.rowKeys.slice();
        this.colKeys_normal = this.colKeys.slice();
        
        this.sortKeys();
    }
    
    // Sort the dimension keys according to the sort rules
    sortKeys() {
        this.colKeys = this.colKeys_normal.slice();
        this.rowKeys = this.rowKeys_normal.slice();
        if (!this.sorted) {
            this.sorted = true;
            
            // Sort
            this.rowKeys.sort(this.arrSort(this.rows, true));
            const sortfun = this.arrSort(this.columns, false);
            this.colKeys.sort(sortfun);
        }
    }
    
    // Combine the configured sort rules into a single comparator function
    arrSort(fieldArr: string[], isRow: boolean) {
        ...
    }
}
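
The body of `arrSort` is elided above. As a rough illustration only, the sketch below builds one comparator for `string[][]` keys from a list of per-dimension rules. The `SimpleSortRule` shape, the `makeKeyComparator` name, and the fallback to ascending order for fields without a rule are assumptions made for this example; they are not the real `SortRules` type or the real `arrSort` logic.

```typescript
// Hypothetical rule shape: sort one dimension field ascending or descending.
interface SimpleSortRule {
  sortField: string;
  sortType: 'asc' | 'desc';
}

// Builds a comparator over dimension-key arrays, where position i of a key
// holds the member of the dimension fieldArr[i].
function makeKeyComparator(fieldArr: string[], rules: SimpleSortRule[]) {
  return (a: string[], b: string[]): number => {
    // Compare level by level, from the outermost dimension inwards,
    // so that members of the same parent stay grouped together.
    for (let i = 0; i < fieldArr.length; i++) {
      const left = a[i] ?? '';
      const right = b[i] ?? '';
      if (left === right) {
        continue; // same member at this level: decide on the next level
      }
      const rule = rules.find(r => r.sortField === fieldArr[i]);
      const cmp = left < right ? -1 : 1;
      // Assumption: fields without a rule fall back to ascending order.
      return rule?.sortType === 'desc' ? -cmp : cmp;
    }
    return 0;
  };
}

const rowKeysNormal: string[][] = [
  ['Online', 'Taobao'],
  ['Offline', 'Store'],
  ['Online', 'JD']
];

// Sort a copy, mirroring how sortKeys() works on a slice of the normal keys.
const sortedRowKeys = rowKeysNormal.slice().sort(
  makeKeyComparator(['channel', 'platform'], [
    { sortField: 'channel', sortType: 'desc' },
    { sortField: 'platform', sortType: 'asc' }
  ])
);
```

Sorting a copy leaves `rowKeysNormal` untouched, so the original order can be restored later, which is the same reason the class keeps `rowKeys_normal` and `colKeys_normal`.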

Data Parsing Process Final Version

The core of the whole process is the design of the dimension tree object and the Aggregator class.
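
To make that relationship concrete, here is a minimal end-to-end sketch, under simplifying assumptions, of how a single pass over the records can fill a dimension tree and collect row and column keys at the same time. `buildMiniDataset`, `MiniDataset`, the `'|'` join character, and the plain `{ sum, count }` cell (standing in for an Aggregator instance) are hypothetical names and shapes chosen for this example only.

```typescript
// Hypothetical record shape, loosely following the sample Records in this document.
type Rec = Record<string, string | number>;

interface MiniCell {
  sum: number;
  count: number;
}

interface MiniDataset {
  // flat row key -> flat column key -> aggregated cell
  tree: Record<string, Record<string, MiniCell>>;
  rowKeys: string[][];
  colKeys: string[][];
}

function buildMiniDataset(
  records: Rec[],
  rows: string[],
  columns: string[],
  indicator: string
): MiniDataset {
  const joinChar = '|'; // assumed join character
  const ds: MiniDataset = { tree: {}, rowKeys: [], colKeys: [] };
  const seenCols = new Set<string>();

  // One pass over the records: group, aggregate and collect keys together.
  for (const record of records) {
    const rowKey = rows.map(field => String(record[field]));
    const colKey = columns.map(field => String(record[field]));
    const flatRowKey = rowKey.join(joinChar);
    const flatColKey = colKey.join(joinChar);

    if (!ds.tree[flatRowKey]) {
      ds.tree[flatRowKey] = {};
      ds.rowKeys.push(rowKey); // first time this row member path is seen
    }
    if (!seenCols.has(flatColKey)) {
      seenCols.add(flatColKey);
      ds.colKeys.push(colKey); // first time this column member path is seen
    }
    if (!ds.tree[flatRowKey][flatColKey]) {
      ds.tree[flatRowKey][flatColKey] = { sum: 0, count: 0 };
    }

    // Only numeric indicator values contribute to the cell.
    const value = record[indicator];
    if (typeof value === 'number') {
      ds.tree[flatRowKey][flatColKey].sum += value;
      ds.tree[flatRowKey][flatColKey].count += 1;
    }
  }
  return ds;
}

const miniDataset = buildMiniDataset(
  [
    { channel: 'Online', platform: 'Taobao', month: 3, price: 3999 },
    { channel: 'Online', platform: 'JD', month: 3, price: 4399 },
    { channel: 'Online', platform: 'Taobao', month: 3, price: 1 }
  ],
  ['channel', 'platform'],
  ['month'],
  'price'
);
```

Looking up a cell is then a direct property access such as `miniDataset.tree['Online|Taobao']['3']`, with no further scan of the records.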

This document was revised and organized by the following personnel

玄魂
