MySQL help needed

Cespian · Elite Vaper · LV 16 · Joined 19/11/15 · Posts: 1,157 · Awards: 18 · Cape Town
Hey Guys

Hoping there are some SQL Gurus here. I need some assistance (MySQL specifically):

Context: I have 2 tables:

Table 1
--------
Customer_id | Entries
1 | 3
2 | 4
3 | 2


Basically, I need to duplicate the Customer_id in Table 2 for each Entry they have. Hence Table 2 should look like result set below after the insert:

Customer_id
1
1
1
2
2
2
2
3
3


How would I go about scripting this? Unfortunately I do not have the luxury of using another language (a loop in PHP would have been perfect), as I have to do this via the command line. Considering that there are 300K+ rows in Table1, manually typing INSERT INTO table2 VALUES for each entry is impossible. And yes, I am aware that Table2 will most likely exceed a billion rows after the insert is complete (I have over 40 GB available).

Another issue I may have is that I will be doing this on a production server and may lock up Table1, but I will create a backup of the table first and work from the backup.

Thanks in advance
 
You can try a cross join with a temporary numbers table, where the number of rows in the numbers table is the maximum number of entries you have. Example below; I don't know how efficient this will be.


select customer_id
from (
    select *
    from (
        select 1 as customer_id, 3 as entries
        union
        select 2 as customer_id, 4 as entries
        union
        select 3 as customer_id, 2 as entries
    ) cust,
    (
        select 1 as x union
        select 2 as x union
        select 3 as x union
        select 4 as x
    ) numbers
) tmp
where x <= entries
order by customer_id;
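
Applied to the real tables, the same idea can be written as a single INSERT ... SELECT. A minimal sketch, assuming the table1/table2 names and columns from the original post, and a pre-built numbers table that covers the maximum Entries value (here 4):

```sql
-- Sketch only: assumes table1(customer_id, entries) and
-- table2(customer_id) as described above.
CREATE TABLE numbers (x INT UNSIGNED NOT NULL);
INSERT INTO numbers VALUES (1), (2), (3), (4);

-- Each customer matches one numbers row per entry they have,
-- so a customer with entries = 3 is inserted 3 times.
INSERT INTO table2 (customer_id)
SELECT t1.customer_id
FROM table1 t1
JOIN numbers n ON n.x <= t1.entries;
```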
 
Efficiency is the least of my worries at this point; however, cross joins are quite intense (resource hoggers). Many thanks though, I'm going to test this tonight on a subset of 50 or so rows to see if it's viable. Thanks
 
It all depends on the number of records in your numbers table. The last time I did this, it took about 2.5 times longer with a numbers table of 10 entries and about 5 times longer with a numbers table of 20 entries, compared with selecting all the records from the original table. This was done with Oracle.
 
There are over 300k rows that need to be multiplied on average 22 times into table2... so yeah lol, might fry some components.
 
Hi @Cespian,

At the moment, your Customer_ID is your PK for table 1, I assume?

For the new table, what will be your new PK?

I have an idea for you, but just need to see what the end plan for the structure is...
 
Hey Bud.

There will be no PKs nor FKs on any of the tables. The customer_id in table1 is, however, indexed. It is not necessary for me to have an index or PK on Table2 due to the customer_id being duplicated plenty of times.

The reason I am doing this is to select a winner for a competition we are running at the moment. A customer gets an entry per item he/she purchases. So Table1 has unique customer_id's and a sum of items purchased during that time. Because Table2 is expected to be too large to export, the selection will happen internally (not by a 3rd party): the file would be too large to move around easily, and encryption would be a pain in the rear as well.

EDIT: The create code for the tables was as follows:

DROP TABLE IF EXISTS tmp_cust_items;
CREATE TABLE tmp_cust_items
(
customer_id INT UNSIGNED,
items SMALLINT,
INDEX idx_cust_id (customer_id)
);

DROP TABLE IF EXISTS tmp_winner_selection;
CREATE TABLE tmp_winner_selection
(
customer_id INT UNSIGNED
);

Nothing special :)

Thank you for looking into it
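
For the draw itself, once tmp_winner_selection is populated, one simple (if brute-force) way to pick a winner inside MySQL is ORDER BY RAND(). A sketch, assuming the tmp_winner_selection table defined above:

```sql
-- Sketch: picks one row at random from the expanded table.
-- ORDER BY RAND() sorts the whole table, which is slow on a very
-- large table, but it only needs to run once for the draw.
SELECT customer_id
FROM tmp_winner_selection
ORDER BY RAND()
LIMIT 1;
```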
 
Okay, I fully understand. I do have an awesome VBA macro for Excel that did this; however, looking at your DB size, I understand what you are looking for.

I'll be right back.
 
@Cespian

One last thing, are you using T-SQL or looking for a "purely" SQL solution?

Without T-SQL there are no loops, so your "nested if" style insert would have to be repeated 10 times if the max number of entries per person is 10, whereas a loop can run to the nth iteration.

If strictly SQL, I am thinking the solution will require a 3rd "numbers only" table, as SQL natively has no means to generate sequential numbers.
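
For what it's worth, a numbers table can be generated inside MySQL itself by cross-joining a small digits table against itself; a sketch (the digits/numbers table names are just illustrative):

```sql
-- Sketch: builds the numbers 1..100 from a ten-row digits table.
CREATE TABLE digits (d INT UNSIGNED NOT NULL);
INSERT INTO digits VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Each (tens, units) pair yields one number; add a third join
-- of the same table to reach 1..1000, and so on.
CREATE TABLE numbers AS
SELECT tens.d * 10 + units.d + 1 AS x
FROM digits AS tens
CROSS JOIN digits AS units;
```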
 
If memory serves me correctly, T-SQL is native to Sybase and MSSQL. I am working on a MySQL Server. I know that most of the functions are basically the same.
I need to go back to school lol, I have no clue how loops work in MySQL.

I am however able to pass the queries through MySQL Workbench or HeidiSQL or something and have the IDE warn me of any syntax errors or failed returns.
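
For reference, MySQL does support loops, but only inside stored procedures. A minimal sketch of the row expansion done that way, assuming the tmp_cust_items and tmp_winner_selection tables from earlier in the thread:

```sql
-- Sketch: on pass i, inserts every customer whose items count is
-- still >= i, so a customer with items = 3 lands in the table 3 times.
DELIMITER //
CREATE PROCEDURE expand_entries()
BEGIN
  DECLARE i INT DEFAULT 1;
  DECLARE max_items INT;
  SELECT MAX(items) INTO max_items FROM tmp_cust_items;
  WHILE i <= max_items DO
    INSERT INTO tmp_winner_selection (customer_id)
    SELECT customer_id FROM tmp_cust_items WHERE items >= i;
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;

CALL expand_entries();
```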
 
100% - I write in Microsoft SQL Server Management Studio, so I just need to ensure my solution is something you could work with... If it's okay, I will play around when I get home and see what I can come up with for you?
 
Thanks bud, I really appreciate it. I will attempt @yuganp's solution as well tonight, but I doubt the cross-join solution will complete this transaction in anywhere under a month lol... but worth a shot.

I need to complete this by next Friday so I have plenty of time, no rush and no urgency, don't lose sleep over it.

I worked on MSSQL exclusively as well until about 1.5 years ago, when I moved to a different company... it was terrible adjusting to MySQL (and the 2 Mongo instances we have here :hit:)... but ask me to do anything in MSSQLMS now and I will stare into an abyss lol.

Thanks again bud.
 
@Cespian
Don't know if this will work, but it may provide insight for a better solution:

select table1.customer,
sum(table1.entries)
from table1
group by 1
order by 1
into temp temptable1;

Where temptable1 will have two columns:
customer
total_entries
 
Sorry bud, I don't quite understand what you are doing in the ORDER BY and GROUP BY clauses. Could you please elaborate?

Thanks
 
I'm no SQL guru, but there are a couple of things I consider when faced with a difficult problem:

  • Using a pencil and exam pad to write the billion entries.
  • Delegating the task to someone else.
  • Rather look for new vape gear on the forums.
  • Quitting and becoming a dance instructor.
Hope that helps. :D
 
If I implement point 4, I could never implement point 3 (mostly because I suck at dancing).
Thanks for the tips... If I don't get this right by my deadline, I will consider these.
 
I sometimes remind myself of what a desperate human I am...

"Would I do 'this' for R10M?" - Yes
"Would I do 'this' for R5M?" - Yes
"Would I do 'this' for R1M?" - Yes



I have come to realize that even R5k questions my morals!

Very simple statement for that attitude:
CASE WHEN money_involved IS NOT NULL THEN 'do it!' ELSE 'fokkof' END AS thats_life_brah
 
The group by will produce a sum of the number of entries, and the order by will just order it by customer, so you get:

CUSTOMER | TOTAL ENTRIES

Bummer, I am doing it the other way around. IT WON'T WORK, SORRY!
 
Thanks anyway bud. Appreciate it.
 