分类目录归档:PostgreSQL恢复

记录一次国产数据库被rm -rf /*删除的救援过程

有一套国产数据库运行在arm平台的Kylin Linux上

NAME="Kylin Linux Advanced Server"
VERSION="V10 (Lance)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Lance)"
ANSI_COLOR="0;31"

由于数据库归档日志导致磁盘空间满,在清理归档日志的时候不慎执行了rm -rf /*命令
rm


导致系统无法正常启动,客户尝试通过把磁盘挂载到其他机器上启动,但是无法恢复需要数据.由于客户这个是ssd盘的虚拟化环境,担心可能触发trim和进一步破坏的风险,建议客户尽快把虚拟磁盘下载到本地,然后通过工具分析,发现大量数据已经被删除,特别是数据库目录只剩下部分文件
1

通过分析data/global/1262文件(pg_database)的出来数据库名字对应关系,然后根据应用提供的需要恢复的数据库名称.在data目录中尝试恢复该目录中oid文件,发现最核心的几个字典文件全部异常(元数据异常)
3

pg_class	1259	存储数据库中所有关系(表、索引、视图、序列等)的元数据
pg_attribute	1249	存储表的列(属性)信息,每个字段对应一行
pg_namespace	2615	存储模式(Schema)信息
pg_type	        1247	存储数据类型信息(内置类型、自定义类型)

对于这种问题,普通的恢复工具无法解决,协调专业的文件系统恢复专家对其进行处理,比较完美的恢复出来的该目录下面绝大部分文件(包括字典文件)
4


有了比较完美的恢复出来了需要的数据库文件,然后pdu专家进行该国产库的块结构分析和调整程序并恢复出来数据
5
6

完成数据恢复之后,把数据提供给客户,由客户那边安排厂商进行后续应用层面工作(比如和历史数据整合,应用调试等),总体算比较完美帮忙客户完成了本次rm -rf /* 故障救援

发表在 Linux, PostgreSQL恢复 | 标签为 , , , | 评论关闭

PostgreSQL oid文件替换实现数据访问

对于postgresql数据库,我们都知道他的数据在独立的oid文件里面,而字典主要在pg_type/1247,pg_class/1259,pg_attribute/1260等对象中,而数据是存储在独立的oid和toast oid文件中.如果通过提供oid文件,可以实现狸猫换太子的恢复数据.这里做一个简单测试,大概思路是这样的库里面有一个表a,然后创建一个空表b,然后把a表相关的oid文件复制成b表的oid文件,然后验证数据是否可以正常查询
创建一个空的b表,结构和a表一致

[postgres@xifenfei 13676]$ psql
psql (12.8)
Type "help" for help.

postgres=# CREATE TABLE "public"."AO_6384AB_DISCOVERED1" (
postgres(#   "DATE" "pg_catalog"."timestamp",
postgres(#   "ID" "pg_catalog"."int4",
postgres(#   "KEY" "pg_catalog"."varchar" COLLATE "pg_catalog"."default",
postgres(#   "PLUGIN_KEY" "pg_catalog"."varchar" COLLATE "pg_catalog"."default",
postgres(#   "USER_KEY" "pg_catalog"."varchar" COLLATE "pg_catalog"."default"
postgres(# )
postgres-# ;
CREATE TABLE

检查a/b的情况

postgres=# select count(1) from "AO_6384AB_DISCOVERED";
 count 
-------
   544
(1 row)

postgres=# select count(1) from "AO_6384AB_DISCOVERED1";
 count 
-------
     0
(1 row)

postgres=# \d "AO_6384AB_DISCOVERED"
                    Table "public.AO_6384AB_DISCOVERED"
   Column   |            Type             | Collation | Nullable | Default 
------------+-----------------------------+-----------+----------+---------
 DATE       | timestamp without time zone |           |          | 
 ID         | integer                     |           |          | 
 KEY        | character varying           |           |          | 
 PLUGIN_KEY | character varying           |           |          | 
 USER_KEY   | character varying           |           |          | 

postgres=# \d "AO_6384AB_DISCOVERED1"
                   Table "public.AO_6384AB_DISCOVERED1"
   Column   |            Type             | Collation | Nullable | Default 
------------+-----------------------------+-----------+----------+---------
 DATE       | timestamp without time zone |           |          | 
 ID         | integer                     |           |          | 
 KEY        | character varying           |           |          | 
 PLUGIN_KEY | character varying           |           |          | 
 USER_KEY   | character varying           |           |          | 

存在数据的a/b表相关oid信息



postgres=# select oid,relname,relfilenode from pg_class where relname like '%AO_6384AB_DISCOVERED%';
  oid  |        relname        | relfilenode 
-------+-----------------------+-------------
 16516 | AO_6384AB_DISCOVERED  |       16516
 17069 | AO_6384AB_DISCOVERED1 |       17069
(2 rows)

postgres=# select oid,relname,relfilenode from pg_class where relname like '%16516%' or  relname like '%17069%'  ;
  oid  |       relname        | relfilenode 
-------+----------------------+-------------
 16519 | pg_toast_16516       |       16519
 16521 | pg_toast_16516_index |       16521
 17072 | pg_toast_17069       |       17072
 17074 | pg_toast_17069_index |       17074
(4 rows)

关闭数据库把a表相关的oid文件拷贝替换b表的oid文件

[postgres@xifenfei 13676]$ pg_ctl stop -D /pgdata
waiting for server to shut down....2025-12-12 20:00:58.804 HKT [21445] LOG:  received fast shutdown request
2025-12-12 20:00:58.805 HKT [21445] LOG:  aborting any active transactions
2025-12-12 20:00:58.805 HKT [21471] FATAL:  terminating connection due to administrator command
2025-12-12 20:00:58.805 HKT [21473] FATAL:  terminating connection due to administrator command
2025-12-12 20:00:58.805 HKT [21856] FATAL:  terminating connection due to administrator command
2025-12-12 20:00:58.806 HKT [21472] FATAL:  terminating connection due to administrator command
2025-12-12 20:00:58.806 HKT [21445] LOG: background worker"logical replication launcher"(PID 252)exited with exit code 1
2025-12-12 20:00:58.807 HKT [21447] LOG:  shutting down
2025-12-12 20:00:58.811 HKT [21445] LOG:  database system is shut down
 done
server stopped
[postgres@xifenfei 13676]$ cp  16516 17069
[postgres@xifenfei 13676]$ cp  16519 17072
[postgres@xifenfei 13676]$ cp  16521 17074
[postgres@xifenfei 13676]$ pg_ctl start -D /pgdata
waiting for server to start....2025-12-12 20:02:05.985 HKT [22241] LOG:  
starting PostgreSQL 12.8 on x86_64-pc-linux-gnu, 
compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26.0.1), 64-bit
2025-12-12 20:02:05.985 HKT [22241] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2025-12-12 20:02:05.986 HKT [22241] LOG:  listening on IPv6 address "::", port 5432
2025-12-12 20:02:05.986 HKT [22241] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2025-12-12 20:02:05.993 HKT [22242] LOG:  database system was shut down at 2025-12-12 20:00:58 HKT
2025-12-12 20:02:05.995 HKT [22241] LOG:  database system is ready to accept connections
 done
server started

查询a/b表数据

[postgres@xifenfei 13676]$ psql
psql (12.8)
Type "help" for help.

postgres=# select count(1) from "AO_6384AB_DISCOVERED1";
 count 
-------
   544
(1 row)

postgres=# select count(1) from "AO_6384AB_DISCOVERED";
 count 
-------
   544
(1 row)

通过上述从测试,证明创建一个新的表结构,完全可以访问老的oid文件中的数据

发表在 PostgreSQL, PostgreSQL恢复 | 标签为 , , , | 评论关闭

Postgres数据库truncate表无有效备份恢复

创建一个Postgres表,并插入数据

postgres=# CREATE TABLE "PeisInterfaceLog"(
postgres(#         "PeisInterfaceLogId" text,
postgres(#         "PeisDepartmentId" text,
postgres(#         "PeisDepartmentName" text,
postgres(#         "PeisInterfaceSubjectType" text,
postgres(#         "PeisInterfaceSubjectId" text,
postgres(#         "PeisInterfaceNo" text,
postgres(#         "PeisInterfaceName" text,
postgres(#         "PeisInterfaceDirection" text,
postgres(#         "PeisInterfaceCallAddress" text,
postgres(#         "PeisInterfaceLogStartTime" timestamp,
postgres(#         "PeisInterfaceInputContent" json,
postgres(#         "PeisInterfaceInputTranscodeContent" json,
postgres(#         "PeisInterfaceOutContent" json,
postgres(#         "PeisInterfaceOutTranscodeContent" json,
postgres(#         "PeisInterfaceSuccessSign" int4,
postgres(#         "PeisInterfaceLogStatusCode" text,
postgres(#         "PeisInterfaceLogNote" text,
postgres(#         "PeisInterfaceLogTimestamp" timestamp,
postgres(#         "PeisInterfaceLogInfo" text,
postgres(#         "PeisPatientRegisterId" text
postgres(# );
CREATE TABLE
postgres=# \i /postgres/COPY/public_copy.sql 
SET
COPY 722957
postgres=# select count(1) from "PeisInterfaceLog";
 count  
--------
 722957
(1 row)

验证truncate操作,引起Postgres中oid的变化

postgres=# checkpoint;
CHECKPOINT
postgres=#  show data_directory;
 data_directory 
----------------
 /pgdata
(1 row)
postgres=# select oid, datname from pg_database ;
  oid  |  datname  
-------+-----------
 13676 | postgres
     1 | template1
 13675 | template0
(3 rows)
postgres=# select relname, relowner, relfilenode from pg_class where relowner = 10 and relname like '%PeisInterfaceLog%';
     relname      | relowner | relfilenode 
------------------+----------+-------------
 PeisInterfaceLog |       10 |       16384
(1 row)

postgres=# truncate table "PeisInterfaceLog";
TRUNCATE TABLE
postgres=# select count(1) from  "PeisInterfaceLog";
 count 
-------
     0
(1 row)

postgres=# select relname, relowner, relfilenode from pg_class where relowner = 10 and relname like '%PeisInterfaceLog%';
     relname      | relowner | relfilenode 
------------------+----------+-------------
 PeisInterfaceLog |       10 |       16394
(1 row)

使用工具进行初始化字典信息

PDU.public=# b;

开始初始化...
 -pg_database:</pgdata/global/1262>

数据库:postgres 
      -pg_schema:</pgdata/base/13676/2615>
      -pg_class:</pgdata/base/13676/1259> 共88行
      -pg_attribute:</pgdata/base/13676/1249> 共2950行
      模式:
        ▌ public 1张表

PDU.public=# use postgres;

┌────────────────────────────────────────┐
│          模式             │  表数量    │
├────────────────────────────────────────┤
│    public                 │  1         │
└────────────────────────────────────────┘
postgres.public=# set public;

┌──────────────────────────────────────────────────┐
│               表名                  │  表大小    │
├──────────────────────────────────────────────────┤
│    PeisInterfaceLog                 │  0         │
└──────────────────────────────────────────────────┘
        仅显示表大小排名前 1 的表名
postgres.public=# \d PeisInterfaceLog;

┌──────────────────────────────────────────────────────────────┐
│                            列类型                            │
└──────────────────────────────────────────────────────────────┘
text,text,text,text,text,text,text,text,text,timestamp,json,json,json,json,int4,text,text,timestamp,text,text

配置软件磁盘扫描操作(pdu.ini中配置)

#dropScan需要扫描的磁盘
DISK_PATH=/data/test.dd
#dropScan时跳跃的数据块数量,数值越小覆盖磁盘越全面,速度越慢
BLOCK_INTERVAL=5

启用磁盘扫描操作

PDU.public=# p idxmode off;

┌─────────────────────────────────────────────────────────────────┐
│              参数                │             当前值           │
├─────────────────────────────────────────────────────────────────┤
│    startwal                      │                              │
│    endwal                        │                              │
│    startlsnt                     │                              │
│    endlsnt                       │                              │
│    starttime                     │                              │
│    endtime                       │                              │
│    resmode(Data Restore Mode)    │              TIME            │
│    exmode(Data Export Mode)      │              CSV             │
│    encoding                      │              UTF8            │
│    restype(Data Restore Type)    │              DELETE          │
          ----------------------DropScan----------------------
│    dsoff(DropScan startOffset)   │              0               │
│    blkiter(Block Intervals)      │              5               │
│    itmpcsv(Items Per Csv)        │              100             │
│    idxmode                       │              off             │
└─────────────────────────────────────────────────────────────────┘
PDU.public=# ds;

 ▌全量扫描恢复模式 

 ▌数据文件扫描 
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
          表名           │                                   结果                                    
├────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
 PeisInterfaceLog           99.976 %(21469593600)   数据页: 71158      成功: 722947    (疑似乱码: 2809      ) 失败: 0 

耗时 15.28 秒
└────────────────────────────────────────────────────────────────────────────────────────────────────┘
扫描完毕,文件目录如下: 
        restore/dropscan/PeisInterfaceLog
PDU.public=# ds copy;
已导出: 
/restore/dropscan/PeisInterfaceLog/COPY.sql
PDU.public=# 

[root@xifenfei PeisInterfaceLog]# more COPY.sql 
[root@xifenfei PeisInterfaceLog]# cat /restore/dropscan/PeisInterfaceLog/COPY.sql
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:29:55_226738176_32760blks_336787items.csv';
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:30:01_595968000_32767blks_330416items.csv';
COPY PeisInterfaceLog FROM '/restore/dropscan/PeisInterfaceLog/09-28-21:30:04_1116061696_5631blks_55744items.csv';

把恢复数据导入到Postgres数据中

[root@xifenfei ~]# su - postgres 
[postgres@xifenfei ~]$ psql
psql (12.8)
Type "help" for help.

postgres=# \i /restore/dropscan/PeisInterfaceLog/COPY.sql
COPY 336787
COPY 330416
COPY 55744
postgres=# select count(1) from "PeisInterfaceLog";
 count  
--------
 722947
(1 row)

经过上述扫描测试证明该工具实现了在Postgres中truncate数据的绝大部分数据恢复(这里的乱码不是由于没有扫描到数据,主要是由于个别字符串由于类型判断关系识别不对导致乱码抛弃).
如果你遇到Postgres 数据库由于drop/truncate等误操作,而且无有效备份进行恢复,面临数据丢失风险,请第一时间保护现场(数据文件所在分区尽可能不要有写入操作),联系我们提供专业恢复技术支持:
电话/微信:17813235971    Q Q:107644445QQ咨询惜分飞    E-Mail:dba@xifenfei.com

发表在 pdu工具, PostgreSQL恢复 | 标签为 , | 评论关闭