Holonet
Posts: 15

Posted: Tue 31.01.12 12:02
Hi everyone,

I'm trying to parse a CSV file. The data is then supposed to go into a database on SQL Server. The CSV file has its pitfalls, though: for example, it contains commas that do not mark a column break, and it has several metadata rows, i.e. several tables.

Here is an excerpt. It is measurement data from a measuring machine. I had to remove some information that should not end up on the internet, but I left the structure untouched:
"Technician Name:","Test Name:","SPALTE:","Descriptions:","Notes:","Test Type","Speed","Averaging","Distance Eng Units","Force Eng Units","Encoder Enabled"
"USERNAME, USERROLE","TESTDESCRIPTION","","TESTING","","TESTTYPE","N/A","100","MM","N","SPALTE",""
" WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE","WERTESPALTE"
"0.079","0","0.0462","0","0","0.0000","0.0000","0.0000","0.0000","0.01563"
.187569273743017,"0",4.36291960165169E-02,0,0,"0.5636",.5,"0.0000","0.0000","0.18675"
.38138598,"0",.04171512,0,0,"1.2637",1,"0.0000","0.0000","0.27728"
.44818825,"0",.039,0,0,"1.7031",1.5,"0.0000","0.0000","0.34262"
.345362,"0",.0399108,0,0,"2.1469",2,"0.0000","0.0000","0.40606"
.26328688,"0",.043376205,0,0,"2.7465",2.5,"0.0000","0.0000","0.48366"
.241412128,"0",.039141912,0,0,"3.2001",3,"0.0000","0.0000","0.54661"
.2221,"0",.03580851,0,0,"3.8150",3.5,"0.0000","0.0000","0.62421"
.234712682857143,"0",3.84180857142857E-02,0,0,"4.2766",4,"0.0000","0.0000","0.68695"
.22758673,"0",.046107075,0,0,"4.7452",4.5,"0.0000","0.0000","0.74968"
.232897564444444,"0",.03728232,0,0,"5.2139",5,"0.0000","0.0000","0.81226"
.24243526,"0",.0389631,0,0,"5.6899",5.5,"0.0000","0.0000","0.87487"
.245615592727273,"0",4.34978072727273E-02,0,0,"6.1659",6,"0.0000","0.0000","0.93746"
.247525976666667,"0",.040821165,0,0,"6.8033",6.5,"0.0000","0.0000","1.03052"
.251911486153846,"0",4.26185261538462E-02,0,0,"7.2867",7,"0.0000","0.0000","1.09343"
.25942647,"0",.03636156,0,0,"7.7698",7.5,"0.0000","0.0000","1.15598"
.265972182666667,"0",5.04203826666667E-02,0,0,"8.2680",8,"0.0000","0.0000","1.21849"
.2873657825,"0",.0551845775,0,0,"8.7661",8.5,"0.0000","0.0000","1.29639"
.282033270588235,"0",5.69999529411765E-02,0,0,"9.2640",9,"0.0000","0.0000","1.35895"
.28726064,"0",.05687786,0,0,"9.7621",9.5,"0.0000","0.0000","1.42169"
.296783538947368,"0",5.49933347368421E-02,0,0,"10.2674",10,"0.0000","0.0000","1.48421"
.295752872,"0",.05321232,0,0,"10.7655",10.5,"0.0000","0.0000","1.56216"
"Results","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","
"
"Results",3.2067,28,0,0,1.39297846091045,0,0,0,0,0,1.67728609090909,28,0,0,.63537496711157,"
"
"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","
"
"36","0","0","0","0","
"
"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","
"
"450","100","100","100","100","
"
"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","
"
"10000","10000","10000","10000","10000","10000","
"
"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","
"
0,0,0,0,0,"","
"
"Profile Name","Date","Time","
"
"PROFILENAME","01-01-1900","00:00:00",


Do you have a tip on how I can parse this file? The number of test results (the rows with the numbers) is variable.
As mentioned, the things giving me trouble are the comma that does not indicate a new column (line 2 of the excerpt), the different tables (metadata rows on lines 1, 3, 26, 30, 34, 38, 42 and 46), the quotation marks that are set around only some of the numbers, and the line break that sits inside quotation marks.
daeve
Posts: 116
Thanks received: 3

Windows (XP Pro, 7 Ultimate x64)
C#,WPF,Java,ASP.Net, VS 2010 Ultimate (x86)
Posted: Tue 31.01.12 19:35
If the CSV doesn't contain any identifying "words" or anything similar, I wouldn't know how to do it either.
Th69
Moderator
Posts: 4799
Thanks received: 1059

Win10
C#, C++ (VS 2017/19/22)
Posted: Tue 31.01.12 21:07
Hello Holonet,

do you mean the comma in "USERNAME, USERROLE"? Pretty much any CSV reader can handle this properly, since quotation marks take precedence.
Try A Fast CSV Reader.
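
To illustrate the point about quotation marks taking precedence, here is a minimal sketch using Python's standard `csv` module. It is not the A Fast CSV Reader library mentioned above; it only assumes a reader that honours double quotes. The helper name `parse_line` is made up for this example, and the two input lines are copied from the excerpt.

```python
import csv
import io


def parse_line(line):
    """Split one CSV line into fields, honouring double quotes."""
    return next(csv.reader(io.StringIO(line)))


# Line 2 of the excerpt: the comma inside the first field is not a separator.
meta = parse_line(
    '"USERNAME, USERROLE","TESTDESCRIPTION","","TESTING","",'
    '"TESTTYPE","N/A","100","MM","N","SPALTE",""'
)
print(meta[0])    # USERNAME, USERROLE
print(len(meta))  # 12

# A measurement row: quoted and unquoted numbers both come out as plain
# strings, so they can be converted the same way.
row = parse_line(
    '.38138598,"0",.04171512,0,0,"1.2637",1,"0.0000","0.0000","0.27728"'
)
values = [float(field) for field in row]
print(values[0], values[5])  # 0.38138598 1.2637
```

So the partly missing quotation marks around the numbers should not matter for the split itself; only the conversion to a numeric type afterwards has to cope with forms such as `.5` and `4.36E-02`.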

And the different headers can only be handled either via unique terms (keywords) or, possibly in your case, via the conspicuous
...,"
"

(i.e. a line break inside the quotation marks as the last column).
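
A rough sketch of that marker idea, again in Python with the standard `csv` module: every record whose last field is nothing but a quoted line break is collected, and the collected records are paired up as header row plus value row. The function name `split_marked_tables` is hypothetical, and the strict header/value alternation is an assumption read off the excerpt, not something the file format guarantees. Records without the marker (lines 1 to 25 of the excerpt, and the last line, which ends in a plain comma) are simply skipped here and would need separate handling.

```python
import csv
import io


def split_marked_tables(text):
    """Return (header, values) pairs for records ending in a quoted line break."""
    marked = []
    # newline="" keeps line breaks inside quoted fields untouched.
    for record in csv.reader(io.StringIO(text, newline="")):
        # Marker: the last field consists only of a line break.
        if record and record[-1] in ("\n", "\r\n"):
            marked.append(record[:-1])
    # Assumption: marked records alternate header row, value row.
    return list(zip(marked[0::2], marked[1::2]))


# Lines 30 to 37 of the excerpt.
sample = (
    '"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","\n"\n'
    '"36","0","0","0","0","\n"\n'
    '"SPALTE","SPALTE","SPALTE","SPALTE","SPALTE","\n"\n'
    '"450","100","100","100","100","\n"\n'
)

for header, values in split_marked_tables(sample):
    print(header, values)
```

The pairs are returned as tuples rather than dictionaries because the anonymised headers are all `SPALTE`; with the real, distinct column names a `dict(zip(header, values))` would be the more convenient shape for the database insert.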