Skip to Content.
Sympa Menu

en - Extraction script

Subject: The mailing list for listmasters using Sympa

List archive

Chronological Thread  
  • From: Michelle Murrain <address@concealed>
  • To: address@concealed
  • Cc: address@concealed
  • Subject: Extraction script
  • Date: Tue, 06 Jul 2004 19:08:39 -0400

Hi there,

Apropo of the conversation a couple of weeks ago or so, and to solve a problem, I whipped up this CLI script today which extracts data from sympa, and either prints a list of email addresses to STDOUT, outputs a comma-delimited file with the fields you want, or maps the data to a database of your choice (postgres or mysql), and INSERTS lines. (no update yet)

The code is only half-decent, likely has at least 45 bugs, and is only modestly PODded. So use at your own risk. But it works. Improvements, changes, ideas are most welcome.
--
.Michelle

--------------------------
Michelle Murrain
mmurrain at dbdes dot com
413-222-6350 ph
617-889-0929 ph
952-674-7253 fax <--- new
Page: address@concealed
AIM:pearlbear0 ICQ:129250575
Y!: michelle_murrain Jabber: address@concealed

"Work like you don't need the money. Love like you've never been hurt. Dance like nobody's watching." - Satchel Paige
#!/usr/bin/perl

=head1 NAME

sympa_extract.pl

=head1 SYNOPSIS

sympa_extract.pl is a CLI perl script that will extract subscriber data from
a sympa database, and either print a list of email addresses to STDOUT,
output a comma delimited file with specified fields, or drop into another
database.

=head1 USAGE

%: perl sypma_extract.pl database_name host port username

=head1 EXAMPLE

%host: perl sympa_extract.pl sympa foo.com 5432 sympa

%host: perl sympa_extract.pl
(This assumes database sympa, localhost, port 5432 and usernam sympa)

=head2 DESCRIPTION

This script is designed to easily extract sympa subscriber data, and output a
list of email addresses, export to a comma delimited file, or put out to a
database.

You

When outputting to a file, choose the fields from sympa you want to output. A
list of available fields is printed.

When outputting to a database, you need to specify db type, database, table,
host, port, username and password. Also, you need to map fields in the
database which match the fields exported from sympa. The script will prompt
you with field names from the database, and you fill the sympa field you want
to be filled with sympa data.

Example:

Available columns from sympa:
gecos_user,cookie_delay_user,password_user,lang_user,attributes_user,list_subscriber,user_subscriber,date_subscriber,update_subscriber,visibility_subscriber,bounce_subscriber,bounce_score_subscriber,comment_subscriber,subscribed_subscriber,
included_subscriber, included_sources_subscribe
test_table_id:
test1:gecos_user
test2:user_subscriber
test3:list_subscriber
test_client:

=back

=cut

# Program to extract subscriber data from sympa lists

use DBI;
use strict;
use warnings;
use Term::ReadKey;

print "Sympa Extraction Tool!\n";
print "Sypma DB - pg/mysql [pg]: ";
my $db_type = <>;
chomp ($db_type); if (!$db_type) {$db_type = "pg";}

my ($dbname,$host,$port,$username) = @ARGV;
if (!$dbname) {$dbname='sympa';}
if (!$host) {$host='localhost';}
if (!$port) {$port='5432';}
if (!$username) {$username='sympa';}

my $password;
print "Password: ";
ReadMode 2;
$password = <>;
ReadMode 0;
print "\n";
chomp($password);

my $data_source;
if ($db_type eq "pg") {$data_source =
"dbi:Pg:dbname=$dbname;host=$host;port=$port";}
else {$data_source = "DBI:mysql:database=$dbname;host=$host;port=$port";}

my $dbh = DBI->connect($data_source, $username,$password) or die "Can't
connect to sympa database!";

# First, let's grab all of the list names

my $sql = "SELECT DISTINCT ON (list_subscriber) list_subscriber from
subscriber_table";
my $sth = $dbh->prepare($sql);
$sth->execute();

my $val; my @list;
$sth->bind_col(1, \$val);
while ($sth->fetch())
{
push @list, $val;
}

$sth->finish;

print "Lists available in this database:";

foreach (@list) {print "$_\n";}

print "List to extract [all]: ";

my $list = <>;
chomp ($list);
if (!$list) { $list = "all";}

print "Note: print will just list email addresses of subscribers\n";
print "print/export to CSV/db [print]: ";
my $output = <>;
chomp ($output);
my $file = ""; my $fields;
my (%new_data,$table,$dbh1);

if ($output =~ /export/i) {
print "File name [sympa_$list.csv]: ";
$file = <>;
chomp ($file);
if (!$file) {$file = "sympa_$list.csv";}
print "Columns to include - comma separated list
(gecos_user,cookie_delay_user,password_user,lang_user,attributes_user,list_subscriber,user_subscriber,date_subscriber,update_subscriber,visibility_subscriber,bounce_subscriber,bounce_score_subscriber,comment_subscriber,subscribed_subscriber,
included_subscriber, included_sources_subscriber) [all]: ";
$fields = <>;
chomp ($fields);
} elsif ($output =~ /db/i) { # output to a db
# we need database info, then we need to map it to what sympa has
print "Destination DB - pg/mysql [pg]: ";
my $db_type = <>;
chomp($db_type);
if (!$db_type) {$db_type = "pg";}
print "database,table,host,port,username:";
my $db_info = <>;
chomp($db_info);
my $password;
print "Password: ";
ReadMode 2;
$password = <>;
ReadMode 0;
print "\n";
chomp($password);
my ($dbname,$host,$port,$username);
($dbname,$table,$host,$port,$username) = split(',',$db_info);
my $data_source;
if ($db_type eq "pg") {$data_source =
"dbi:Pg:dbname=$dbname;host=$host;port=$port";}
else {$data_source = "DBI:mysql:database=$dbname;host=$host;port=$port";}

$dbh1 = DBI->connect($data_source, $username,$password) or die "Can't
connect to other database!";

my $sql = "SELECT * from $table";
my $sth1 = $dbh1->prepare($sql);
$sth1->execute();

my @column;

my $numfields = $sth1->{NUM_OF_FIELDS};
for (my $i=0; $i<$numfields; $i++) {
$column[$i] = $sth1 -> {NAME} -> [$i];
}

print "Available columns from sympa:
gecos_user,cookie_delay_user,password_user,lang_user,attributes_user,list_subscriber,user_subscriber,date_subscriber,update_subscriber,visibility_subscriber,bounce_subscriber,bounce_score_subscriber,comment_subscriber,subscribed_subscriber,
included_subscriber, included_sources_subscribe\n";

foreach my $column (@column) {
print "$column:";
my $item = <>;
chomp($item);
if ($item) {$new_data{$column} = $item;}
}
$sth1->finish;
$fields = join(',',values %new_data);

} else {
$fields = "user_subscriber";
}

if (!$fields) {$fields = "*";}
print "Processing ....\n";

# Ok, now get that data

$sql = "SELECT $fields FROM subscriber_table,user_table WHERE
email_user=user_subscriber";
if ($list ne "all") {$sql .= " AND list_subscriber like '$list'";}

print "SQL:$sql\n";

$sth = $dbh->prepare($sql);
$sth->execute();

my (@column,@data);

my $numfields = $sth->{NUM_OF_FIELDS};
for (my $i=0; $i<$numfields; $i++) {
$column[$i] = $sth -> {NAME} -> [$i];
}

while (my $hash_ref = $sth->fetchrow_hashref) {
my $rec;
foreach (@column) {
$rec->{$_}=$hash_ref->{$_};
}
push @data, $rec;
}

if (!$output) {
# spit out email addresses
foreach my $record (@data) {print "$record->{user_subscriber}\n";}
} elsif ($output =~ /db/i) { # db spit

my @fields = keys %new_data;
my $fields = join(',',@fields);
my $sth1;

foreach my $record (@data) {
# extract the data from that line that will be inserted
my @values = values %$record;
foreach my $value (@values) {$value = $dbh1->quote($value);}
my $values = join(',',@values);
# make the SQL statement
my $sql = "INSERT INTO $table ($fields) VALUES ($values)";
# print "SQL:$sql\n";

$sth1 = $dbh1->prepare($sql);
$sth1->execute();
}
$sth1->finish;

} else { # make a file
open (OUTPUT,">$file") or die "Can't open $file for output!\n";

my @output;
push @output, join(',',@column);
foreach my $record (@data) {
my $line;
foreach (keys %$record) {
if ($record->{$_}) {$line .= "$record->{$_},";}
}
chop($line);
push @output, $line;
}

foreach my $line (@output) {
print OUTPUT $line;
print OUTPUT "\n";
}

close OUTPUT;
}


$sth->finish;
$dbh->disconnect;

print "Done.";
exit;



Archive powered by MHonArc 2.6.19+.

Top of Page