Preliminary Preparations

Precautions

  • Data must be masked before transmission to ensure it doesn’t include personal information.
  • The backed-up table data is stored in its original form.
    • While streaming Fluentd and client log transmissions perform basic parsing of the data (such as IP addresses), log batch mode performs no parsing.
  • Log data must have dateTime and category columns.
  • Log batch transmission is for uploading large amounts of data at once. Combine multiple files that were split on a row basis into a single file before upload, and transmit real-time data using client log transmission or streaming Fluentd instead (see the sketch after this list).
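
For example, here is a minimal sketch of combining row-split part files into a single batch file before upload. It assumes CSV input; the file names and the account_table_snapshot category are illustrative placeholders, not fixed values.

```python
import csv
import glob
from datetime import datetime

# Hypothetical file names and category; adjust to your own environment.
PART_FILES = sorted(glob.glob("account_table_snapshot.part*.csv"))
OUTPUT_FILE = "account_table_snapshot.csv"
CATEGORY = "account_table_snapshot"  # must match the table name in the log definition

# dateTime is the upload time, "YYYY-MM-DD hh:mm:ss", KST without a timezone suffix.
UPLOAD_TIME = datetime.now().strftime("%Y-%m-%d %H:%M:%S")

with open(OUTPUT_FILE, "w", newline="") as out:
    writer = None
    for path in PART_FILES:
        with open(path, newline="") as part:
            for row in csv.DictReader(part):
                # Add the two mandatory log batch columns to every row.
                row["dateTime"] = UPLOAD_TIME
                row["category"] = CATEGORY
                if writer is None:
                    writer = csv.DictWriter(out, fieldnames=list(row.keys()))
                    writer.writeheader()
                writer.writerow(row)
```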


How to Define Logs

Using the Define Logs Page

  • The log schema should be specified in advance through the log definition.
  • For detailed information on log definition, please refer to Define Logs.
    • The table name set in the log definition must match the category value (see the conceptual sketch after this list).
    • If you proceed with log batch transmission without defining logs, the data will not be stored.
  • Log batch requires dateTime and category as mandatory columns; if they are not transmitted, the data will not be stored.
  • Reserved fields used in analytics may be included during log definition.
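
To make the table-name rule concrete, the sketch below shows what a log definition conceptually contains. It is an illustration only: the actual schema is configured on the Define Logs page, and the account_table_snapshot table and its user-defined columns are hypothetical.

```python
# Conceptual sketch only: the real schema is set up in Hive Console > Analytics > Define Logs.
log_definition = {
    "table": "account_table_snapshot",  # must equal the "category" value in every transmitted row
    "columns": [
        {"name": "dateTime", "type": "TIMESTAMP"},  # mandatory; used for partitioning
        {"name": "category", "type": "STRING"},     # mandatory; routes rows to this table
        {"name": "account_id", "type": "STRING"},   # hypothetical user-defined column
        {"name": "balance", "type": "INTEGER"},     # hypothetical user-defined column
    ],
}
```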


Mandatory Columns

Batch file logs have two mandatory columns.

Column Name   Data Type   Description                                          Sample
dateTime      TIMESTAMP   The time of extraction or upload of the log batch;   “YYYY-MM-DD hh:mm:ss”
                          excludes timezone (KST)
category      STRING      Category ID set in the log collection back office    “account_table_snapshot”
  • If you have been using “dateTime” as a column name in your log batch data, it is recommended to rename that column before transmission.
  • dateTime is used for partitioning, so it must be set to the time the batch file log was transmitted or uploaded (see the validation sketch after this list).
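
As a pre-upload safeguard, a check along the following lines can confirm that both mandatory columns are present and well-formed. The validate_batch_file helper and the file name are illustrative, not part of the log batch tooling.

```python
import csv
from datetime import datetime

def validate_batch_file(path, expected_category):
    """Illustrative helper: verify the mandatory log batch columns before upload."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        missing = {"dateTime", "category"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing mandatory columns: {sorted(missing)}")
        for line_no, row in enumerate(reader, start=2):
            # dateTime must be "YYYY-MM-DD hh:mm:ss" (KST, no timezone suffix).
            datetime.strptime(row["dateTime"], "%Y-%m-%d %H:%M:%S")
            # category must match the table name set in the log definition.
            if row["category"] != expected_category:
                raise ValueError(f"line {line_no}: unexpected category {row['category']!r}")

validate_batch_file("account_table_snapshot.csv", "account_table_snapshot")
```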


Applying for Permissions

Applying for BigQuery Permissions

  • BigQuery access consists of two permissions: one for connecting to BigQuery and one for querying (viewing) data.
  • When you apply for BigQuery permissions, the permissions for GCS upload are also granted.
  • You can apply for BigQuery access through the Hive Console > Analytics > Log Definitions > Access BigQuery menu.
  • For more details, please refer to the Permission Application Guide. Once access is granted, you can verify it with the connectivity sketch below.
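
A minimal connectivity check along these lines can confirm that both the connection and data-querying permissions work. This sketch assumes the google-cloud-bigquery Python client with application default credentials; the project ID is a placeholder.

```python
from google.cloud import bigquery

# Placeholder project ID; substitute the project associated with your granted permissions.
client = bigquery.Client(project="my-analytics-project")

# A trivial query: if it succeeds, both the connection and querying permissions are in place.
for row in client.query("SELECT 1 AS ok").result():
    print(row.ok)  # prints 1
```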