[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-pipeline-data-throughput-tinggi-batch-insert-copy-resolusi-konflik":3},{"article":4,"author":58},{"id":5,"category_id":6,"title":7,"slug":8,"excerpt":9,"content_md":10,"content_html":11,"locale":12,"author_id":13,"published":14,"published_at":15,"meta_title":16,"meta_description":17,"focus_keyword":18,"og_image":19,"canonical_url":19,"robots_meta":20,"created_at":15,"updated_at":15,"tags":21,"category_name":24,"related_articles":39},"d2000000-0000-0000-0000-000000000128","a0000000-0000-0000-0000-000000000025","Pipeline Data Throughput Tinggi — Batch Insert, COPY, dan Resolusi Konflik","pipeline-data-throughput-tinggi-batch-insert-copy-resolusi-konflik","Membangun pipeline data berkecepatan tinggi dengan PostgreSQL: batch insert, COPY protocol, ON CONFLICT handling, dan teknik buffering untuk throughput maksimal.","## Tantangan: Jutaan Insert per Jam\n\nKetika aplikasi Anda perlu memasukkan ratusan ribu atau jutaan baris per jam — event logging, data IoT, atau indeksasi blockchain — insert satu per satu tidak akan cukup. Anda memerlukan teknik bulk loading.\n\n## Perbandingan Metode Insert\n\n| Metode | Throughput | Latensi | Kompleksitas |\n|--------|-----------|---------|-------------|\n| Single INSERT | ~500\u002Fdetik | Rendah | Rendah |\n| Batch INSERT | ~10.000\u002Fdetik | Sedang | Sedang |\n| COPY | ~100.000\u002Fdetik | Tinggi | Sedang |\n| COPY BINARY | ~150.000\u002Fdetik | Tinggi | Tinggi |\n\n## Batch INSERT\n\nGabungkan beberapa INSERT menjadi satu statement:\n\n```rust\nasync fn batch_insert(\n    pool: &PgPool,\n    events: &[Event],\n) -> Result\u003C(), DbError> {\n    if events.is_empty() { return Ok(()); }\n    \n    let mut query = String::from(\n        \"INSERT INTO events (id, event_type, payload, created_at) VALUES \"\n    );\n    \n    let mut params: Vec\u003CBox\u003Cdyn sqlx::Encode\u003CPostgres> + Send>> = Vec::new();\n    \n    for (i, event) in events.iter().enumerate() {\n        if i > 0 { query.push_str(\", \"); }\n        let base = i * 4;\n        query.push_str(&format!(\n            \"(${}, ${}, ${}, ${})\",\n            base + 1, base + 2, base + 3, base + 4\n        ));\n    }\n    \n    query.push_str(\" ON CONFLICT (id) DO NOTHING\");\n    \n    let mut q = sqlx::query(&query);\n    for event in events {\n        q = q.bind(&event.id)\n             .bind(&event.event_type)\n             .bind(&event.payload)\n             .bind(&event.created_at);\n    }\n    \n    q.execute(pool).await?;\n    Ok(())\n}\n```\n\nPerformance tip: batch size 100-1000 memberikan throughput optimal. Lebih dari 1000 dan overhead parsing SQL membatalkan keuntungan.\n\n## COPY Protocol\n\nCOPY adalah protokol PostgreSQL yang dioptimasi untuk bulk loading. 
## COPY Protocol

COPY is PostgreSQL's protocol optimized for bulk loading. It is much faster than INSERT because rows are streamed to the server without per-statement parsing and planning:

```rust
use bytes::Bytes;
use futures_util::{pin_mut, SinkExt};

async fn copy_insert(
    client: &tokio_postgres::Client,
    events: &[Event],
) -> Result<u64, DbError> {
    // copy_in returns a sink that streams raw COPY data to the server.
    let sink = client
        .copy_in("COPY events (id, event_type, payload, created_at) FROM STDIN WITH (FORMAT csv)")
        .await?;
    pin_mut!(sink);

    // Serialize the batch to CSV in memory, then push it through the sink.
    let mut csv_writer = csv::Writer::from_writer(Vec::new());
    for event in events {
        csv_writer.write_record(&[
            event.id.to_string(),
            event.event_type.clone(),
            serde_json::to_string(&event.payload)?,
            event.created_at.to_rfc3339(),
        ])?;
    }
    let data = csv_writer.into_inner()?;

    sink.send(Bytes::from(data)).await?;
    let rows = sink.finish().await?; // number of rows copied
    Ok(rows)
}
```

## ON CONFLICT: Upsert Pattern

ON CONFLICT turns a plain INSERT into an upsert: update the existing row, or skip the duplicate entirely.

```sql
-- Insert or update
INSERT INTO metrics (sensor_id, timestamp, value)
VALUES ($1, $2, $3)
ON CONFLICT (sensor_id, timestamp) DO UPDATE SET
    value = EXCLUDED.value,
    updated_at = NOW();

-- Insert or skip
INSERT INTO events (id, type, data)
VALUES ($1, $2, $3)
ON CONFLICT (id) DO NOTHING;
```

## Buffering Pattern in Rust

Collect incoming events in memory and flush them as a batch, either when the buffer fills up or when a timer fires:

```rust
use std::time::Duration;
use sqlx::PgPool;
use tokio::sync::mpsc;

struct InsertBuffer {
    buffer: Vec<Event>,
    capacity: usize,
    pool: PgPool,
    flush_interval: Duration,
}

impl InsertBuffer {
    fn new(pool: PgPool, capacity: usize, flush_interval: Duration) -> Self {
        Self {
            buffer: Vec::with_capacity(capacity),
            capacity,
            pool,
            flush_interval,
        }
    }

    async fn add(&mut self, event: Event) -> Result<(), DbError> {
        self.buffer.push(event);
        if self.buffer.len() >= self.capacity {
            self.flush().await?;
        }
        Ok(())
    }

    async fn flush(&mut self) -> Result<(), DbError> {
        if self.buffer.is_empty() { return Ok(()); }
        let events: Vec<Event> = self.buffer.drain(..).collect();
        batch_insert(&self.pool, &events).await
    }
}

// Time-based flushing in addition to size-based flushing
async fn run_buffer(mut buffer: InsertBuffer, mut rx: mpsc::Receiver<Event>) {
    let mut interval = tokio::time::interval(buffer.flush_interval);

    loop {
        tokio::select! {
            maybe_event = rx.recv() => {
                match maybe_event {
                    Some(event) => buffer.add(event).await.unwrap(),
                    // Channel closed: flush the remainder and stop.
                    None => {
                        buffer.flush().await.unwrap();
                        break;
                    }
                }
            }
            _ = interval.tick() => {
                buffer.flush().await.unwrap();
            }
        }
    }
}
```

## Additional Optimizations

### 1. Temporarily Drop Indexes

Maintaining indexes row by row is often the dominant cost of a large load. For big one-off loads it is usually cheaper to drop the index, load the data, and rebuild the index once afterwards. (The `fastupdate` storage parameter serves a similar purpose, but it only applies to GIN indexes.)

```sql
-- For large bulk loads
DROP INDEX idx_events_created_at;
-- Load data...
CREATE INDEX idx_events_created_at ON events (created_at);
```

### 2. Unlogged Tables

```sql
-- Skips the WAL: 2-3x faster, but not crash-safe
CREATE UNLOGGED TABLE staging_events (LIKE events);
-- Load into staging, then: INSERT INTO events SELECT * FROM staging_events;
```
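To make the staging flow concrete, here is one possible end-to-end sequence that combines COPY, the unlogged staging table, and ON CONFLICT deduplication. It is a sketch: the `staging_events` table and column list simply mirror the `events` table used throughout this article.

```sql
-- 1. Stage: bulk-load into the unlogged table (no WAL, no conflict checks)
CREATE UNLOGGED TABLE IF NOT EXISTS staging_events (LIKE events INCLUDING DEFAULTS);
COPY staging_events (id, event_type, payload, created_at) FROM STDIN WITH (FORMAT csv);

-- 2. Merge: move staged rows into the durable table, skipping duplicates
INSERT INTO events (id, event_type, payload, created_at)
SELECT id, event_type, payload, created_at
FROM staging_events
ON CONFLICT (id) DO NOTHING;

-- 3. Reset the staging table for the next batch
TRUNCATE staging_events;
```

The merge step is a single WAL-logged statement, so the durable table stays consistent even if the load into staging is interrupted partway through.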
### 3. WAL Tuning

```ini
# postgresql.conf for bulk loading
wal_level = minimal
max_wal_senders = 0   # required when wal_level = minimal
fsync = off           # WARNING: risk of data loss on a crash!
```

## Monitoring Throughput

`pg_stat_user_tables` exposes a cumulative insert counter per table. Sample it at a fixed interval and divide the difference between consecutive samples by the interval length to get rows per second:

```sql
-- Cumulative inserts; diff successive samples to compute rows/second
SELECT
    relname,
    n_tup_ins AS total_inserts
FROM pg_stat_user_tables
WHERE relname = 'events';
```

## Conclusion

High-throughput data pipelines need batch INSERT or COPY to avoid per-row overhead. Application-side buffering collects events before each flush, ON CONFLICT handles duplicates, and PostgreSQL tuning (WAL settings, indexes, unlogged tables) extracts the remaining throughput. For extreme cases, binary COPY delivers the highest throughput with minimal parsing overhead.
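For that extreme case, tokio-postgres exposes the binary COPY format through `BinaryCopyInWriter`. The sketch below reuses the `Event` and `DbError` types from the earlier examples and assumes `id` is a `uuid`, `payload` a `jsonb` column, and `created_at` a `timestamptz`, with the matching uuid/serde_json/chrono conversion features enabled on tokio-postgres:

```rust
use futures_util::pin_mut;
use tokio_postgres::binary_copy::BinaryCopyInWriter;
use tokio_postgres::types::Type;

async fn copy_insert_binary(
    client: &tokio_postgres::Client,
    events: &[Event],
) -> Result<u64, DbError> {
    let sink = client
        .copy_in("COPY events (id, event_type, payload, created_at) FROM STDIN WITH (FORMAT binary)")
        .await?;

    // The declared types must match the COPY column list, in order.
    let writer = BinaryCopyInWriter::new(
        sink,
        &[Type::UUID, Type::TEXT, Type::JSONB, Type::TIMESTAMPTZ],
    );
    pin_mut!(writer);

    for event in events {
        writer
            .as_mut()
            .write(&[&event.id, &event.event_type, &event.payload, &event.created_at])
            .await?;
    }

    let rows = writer.finish().await?; // number of rows copied
    Ok(rows)
}
```

Each value is encoded in PostgreSQL's binary wire format, so neither side pays for text formatting or parsing, which is where the extra throughput over CSV COPY comes from.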