Posted 11.01.10, 18:57   #1
easteregg
[Solved] Help with preg_match and/or empty()

Hey,

I want to parse the concert listings from MySpace pages...

Here is what I have so far (my regex knowledge is pretty modest :|)

Code:
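<?php
    // Fetch the band's profile page.
    $site = file_get_contents("http://www.myspace.com/sonoramilagrosa");
    // Isolate the schedule block (U = ungreedy, s = dot also matches newlines).
    preg_match('~<div id="profile_bandschedule">(.*)</div>~Us',$site,$hits);
    unset($hits[0]);

    if (isset($hits[1])) {
        // Collect whatever text sits between the tags.
        preg_match_all('~.*>.*>?(.*)<.*~U',$hits[1],$hits);
        unset($hits[0]);
    }

    $hits = $hits[1];

    // Try to drop the empty entries (the loop in question).
    for($i=0; $i < count($hits); $i++) {
        if (empty($hits[$i])) unset($hits[$i]);
    }
    var_dump($hits);
?>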
This is what needs to be parsed:

Code:
<div id="profile_bandschedule">
<table bordercolor="#6699cc" cellspacing="0" cellpadding="0" width="440" bgcolor="#6699cc" border="0">

  <tr>
    <td>
      <table width="440" border="0" cellspacing="0" cellpadding="0">
        <tr>
          <td bgcolor="#6699cc" class="text" align="left" style="WORD-WRAP:break-word">&nbsp;&nbsp;&nbsp;<span class="whitetext12">Anstehende Konzerte</span></td>
          <td align="right"><font color="#ffffff" size="2" face="Arial, Helvetica, sans-serif"><span align="right" class="whitelink"><font size="1">( <a href="http://collect.myspace.com/index.cfm?fuseaction=bandprofile.listAllShows&friendid=430662499&n=Sonora+Milagrosa" class="whitelink">Alle zeigen</a> )</font></span></font></td>

        </tr>
      </table>
    </td>
  </tr>
  <tr>
    <td style="PADDING-RIGHT: 3px; PADDING-LEFT: 3px; PADDING-BOTTOM: 3px; PADDING-TOP: 3px">
  
  
      <table width="440" border="0" cellspacing="2" cellpadding="2" bgcolor="#ffffff">
  
  
        <tr>
          <td width="120" bgcolor="#b1DOfO">

            <table width="120" border="0" cellspacing="2" cellpadding="0">
              <tr>
                
                <td width="85"><font size="1" face="Arial, Helvetica, sans-serif">31. Dez. 2009</font></td>
                
                <td width="35" align="right"><font size="1" face="Arial, Helvetica, sans-serif">23:00</font></td>
              </tr>
            </table>
          </td>
          <td width="191" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif"><a href="http://music.myspace.com/index.cfm?fuseaction=music.showDetails&friendid=430662499&Band_Show_ID=49984">SONORA MILAGROSA  @ DONOSTI - SAN SEBASTIAN</a></font></td>

          <td width="115" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif">Guipúzcoa</font></td>
        </tr>
  
        <tr>
          <td width="120" bgcolor="#b1DOfO">
            <table width="120" border="0" cellspacing="2" cellpadding="0">
              <tr>
                
                <td width="85"><font size="1" face="Arial, Helvetica, sans-serif">02. Jan. 2010</font></td>
                
                <td width="35" align="right"><font size="1" face="Arial, Helvetica, sans-serif">20:00</font></td>

              </tr>
            </table>
          </td>
          <td width="191" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif"><a href="http://music.myspace.com/index.cfm?fuseaction=music.showDetails&friendid=430662499&Band_Show_ID=50829">Sonora Milagrosa Sound System @ Pub Leize Gorria</a></font></td>
          <td width="115" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif">Donosti, Guipúzcoa</font></td>
        </tr>
  
        <tr>
          <td width="120" bgcolor="#b1DOfO">

            <table width="120" border="0" cellspacing="2" cellpadding="0">
              <tr>
                
                <td width="85"><font size="1" face="Arial, Helvetica, sans-serif">16. Jan. 2010</font></td>
                
                <td width="35" align="right"><font size="1" face="Arial, Helvetica, sans-serif">23:00</font></td>
              </tr>
            </table>
          </td>
          <td width="191" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif"><a href="http://music.myspace.com/index.cfm?fuseaction=music.showDetails&friendid=430662499&Band_Show_ID=48814">TERRABEATS FESTIVAL Festsaal Kreuzberg - Sonora Milagrosa, Shazalakazoo, Gankino Circus & Meniak</a></font></td>

          <td width="115" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif">Berlin</font></td>
        </tr>
  
        <tr>
          <td width="120" bgcolor="#b1DOfO">
            <table width="120" border="0" cellspacing="2" cellpadding="0">
              <tr>
                
                <td width="85"><font size="1" face="Arial, Helvetica, sans-serif">03. Feb. 2010</font></td>
                
                <td width="35" align="right"><font size="1" face="Arial, Helvetica, sans-serif">21:00</font></td>

              </tr>
            </table>
          </td>
          <td width="191" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif"><a href="http://music.myspace.com/index.cfm?fuseaction=music.showDetails&friendid=430662499&Band_Show_ID=49991">SONORA MILAGROSA + INTICHE - NIM ALAE MC @ ANMESTY INTERNATIONAL  AKTIONEN - TBC HAMBURG</a></font></td>
          <td width="115" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif">Hamburg</font></td>
        </tr>
  
        <tr>
          <td width="120" bgcolor="#b1DOfO">

            <table width="120" border="0" cellspacing="2" cellpadding="0">
              <tr>
                
                <td width="85"><font size="1" face="Arial, Helvetica, sans-serif">21. Mai. 2010</font></td>
                
                <td width="35" align="right"><font size="1" face="Arial, Helvetica, sans-serif">19:00</font></td>
              </tr>
            </table>
          </td>
          <td width="191" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif"><a href="http://music.myspace.com/index.cfm?fuseaction=music.showDetails&friendid=430662499&Band_Show_ID=50610">SONORA MILAGROSA KONZERT @ KARNEVAL DER KULTUREN</a></font></td>

          <td width="115" bgcolor="#d5e8fb"><font size="1" face="Arial, Helvetica, sans-serif">Berlin</font></td>
        </tr>
  
      </table>
      
    </td>
  </tr>
</table>
</div>
The first preg_match takes care of that.
If I leave out the for() loop, I get roughly the following output:

Code:
array(82) {
  [0]=>
  string(18) "&nbsp;&nbsp;&nbsp;"
  [1]=>
  string(19) "Anstehende Konzerte"
  [2]=>
  string(0) ""
  [3]=>
  string(0) ""
  [4]=>
  string(0) ""
  [5]=>
  string(0) ""
  [6]=>
  string(2) "( "
  [7]=>
  string(11) "Alle zeigen"
  [8]=>
  string(2) " )"
  [9]=>
  string(0) ""
  [10]=>
  string(0) ""
  [11]=>
  string(0) ""
  [12]=>
  string(0) ""
  [13]=>
  string(13) "31. Dez. 2009"
  [14]=>
  string(0) ""
  [15]=>
  string(0) ""
  [16]=>
  string(5) "23:00"
  [17]=>
  string(0) ""
  [18]=>
  string(0) ""
  [19]=>
  string(0) ""
  [20]=>
  string(43) "SONORA MILAGROSA  @ DONOSTI - SAN SEBASTIAN"
  [21]=>
  string(0) ""
  [22]=>
  string(0) ""
  [23]=>
  string(0) ""
  [24]=>
  string(10) "Guipúzcoa"
  [25]=>
  string(0) ""
  [26]=>
  string(0) ""
  [27]=>
  string(13) "02. Jan. 2010"
  [28]=>
  string(0) ""
  [29]=>
  string(0) ""
  [30]=>
  string(5) "20:00"
  [31]=>
  string(0) ""
  [32]=>
  string(0) ""
  [33]=>
  string(0) ""
  [34]=>
  string(48) "Sonora Milagrosa Sound System @ Pub Leize Gorria"
  [35]=>
  string(0) ""
  [36]=>
  string(0) ""
  [37]=>
  string(0) ""
  [38]=>
  string(19) "Donosti, Guipúzcoa"
  [39]=>
  string(0) ""
  [40]=>
  string(0) ""
  [41]=>
  string(13) "16. Jan. 2010"
  [42]=>
  string(0) ""
  [43]=>
  string(0) ""
  [44]=>
  string(5) "23:00"
  [45]=>
  string(0) ""
  [46]=>
  string(0) ""
  [47]=>
  string(0) ""
  [48]=>
  string(96) "TERRABEATS FESTIVAL Festsaal Kreuzberg - Sonora Milagrosa, Shazalakazoo, Gankino Circus & Meniak"
  [49]=>
  string(0) ""
  [50]=>
  string(0) ""
  [51]=>
  string(0) ""
  [52]=>
  string(6) "Berlin"
  [53]=>
  string(0) ""
  [54]=>
  string(0) ""
  [55]=>
  string(13) "03. Feb. 2010"
  [56]=>
  string(0) ""
  [57]=>
  string(0) ""
  [58]=>
  string(5) "21:00"
  [59]=>
  string(0) ""
  [60]=>
  string(0) ""
  [61]=>
  string(0) ""
  [62]=>
  string(88) "SONORA MILAGROSA + INTICHE - NIM ALAE MC @ ANMESTY INTERNATIONAL  AKTIONEN - TBC HAMBURG"
  [63]=>
  string(0) ""
  [64]=>
  string(0) ""
  [65]=>
  string(0) ""
  [66]=>
  string(7) "Hamburg"
  [67]=>
  string(0) ""
  [68]=>
  string(0) ""
  [69]=>
  string(13) "21. Mai. 2010"
  [70]=>
  string(0) ""
  [71]=>
  string(0) ""
  [72]=>
  string(5) "19:00"
  [73]=>
  string(0) ""
  [74]=>
  string(0) ""
  [75]=>
  string(0) ""
  [76]=>
  string(48) "SONORA MILAGROSA KONZERT @ KARNEVAL DER KULTUREN"
  [77]=>
  string(0) ""
  [78]=>
  string(0) ""
  [79]=>
  string(0) ""
  [80]=>
  string(6) "Berlin"
  [81]=>
  string(0) ""
}
When I use empty() to check whether an entry is empty and unset it if so, only the empty entries at the start of the array get removed. The empty entries towards the end always stay?!

Code:
D:\coding\myspace bandfeed>php run.php
array(49) {
  [0]=>
  string(18) "&nbsp;&nbsp;&nbsp;"
  [1]=>
  string(19) "Anstehende Konzerte"
  [6]=>
  string(2) "( "
  [7]=>
  string(11) "Alle zeigen"
  [8]=>
  string(2) " )"
  [13]=>
  string(13) "31. Dez. 2009"
  [16]=>
  string(5) "23:00"
  [20]=>
  string(43) "SONORA MILAGROSA  @ DONOSTI - SAN SEBASTIAN"
  [24]=>
  string(10) "Guipúzcoa"
  [27]=>
  string(13) "02. Jan. 2010"
  [30]=>
  string(5) "20:00"
  [34]=>
  string(48) "Sonora Milagrosa Sound System @ Pub Leize Gorria"
  [38]=>
  string(19) "Donosti, Guipúzcoa"
  [41]=>
  string(13) "16. Jan. 2010"
  [44]=>
  string(5) "23:00"
  [48]=>
  string(96) "TERRABEATS FESTIVAL Festsaal Kreuzberg - Sonora Milagrosa, Shazalakazoo, Gankino Circus & Meniak"
  [49]=>
  string(0) ""
  [50]=>
  string(0) ""
  [51]=>
  string(0) ""
  [52]=>
  string(6) "Berlin"
  [53]=>
  string(0) ""
  [54]=>
  string(0) ""
  [55]=>
  string(13) "03. Feb. 2010"
  [56]=>
  string(0) ""
  [57]=>
  string(0) ""
  [58]=>
  string(5) "21:00"
  [59]=>
  string(0) ""
  [60]=>
  string(0) ""
  [61]=>
  string(0) ""
  [62]=>
  string(88) "SONORA MILAGROSA + INTICHE - NIM ALAE MC @ ANMESTY INTERNATIONAL  AKTIONEN - TBC HAMBURG"
  [63]=>
  string(0) ""
  [64]=>
  string(0) ""
  [65]=>
  string(0) ""
  [66]=>
  string(7) "Hamburg"
  [67]=>
  string(0) ""
  [68]=>
  string(0) ""
  [69]=>
  string(13) "21. Mai. 2010"
  [70]=>
  string(0) ""
  [71]=>
  string(0) ""
  [72]=>
  string(5) "19:00"
  [73]=>
  string(0) ""
  [74]=>
  string(0) ""
  [75]=>
  string(0) ""
  [76]=>
  string(48) "SONORA MILAGROSA KONZERT @ KARNEVAL DER KULTUREN"
  [77]=>
  string(0) ""
  [78]=>
  string(0) ""
  [79]=>
  string(0) ""
  [80]=>
  string(6) "Berlin"
  [81]=>
  string(0) ""
}
I'm completely stumped right now. On the one hand, I can't manage to tune the regex so that I wouldn't have to filter in the first place.
On the other hand, I don't understand the behavior of the empty() loop, because all the empty entries really are empty when you print them with var_dump.
So why aren't they deleted?
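
A minimal sketch of what is probably happening in the loop, assuming the cause is that count() is evaluated again on every pass: each unset() makes the array shorter, so the loop condition becomes false before $i reaches the last indexes. The array below is a small hypothetical stand-in for the real match list, and the names $broken, $fixed and $filtered are made up for illustration.

```php
<?php
// Hypothetical stand-in for the match list: data mixed with empty strings.
$hits = array('a', '', '', 'b', '', '', 'c', '', '', '');

// Original pattern: count() is re-evaluated on every pass, and each unset()
// shrinks it, so the loop ends before $i reaches the highest index.
$broken = $hits;
for ($i = 0; $i < count($broken); $i++) {
    if (empty($broken[$i])) unset($broken[$i]);
}

// Variant 1: take the length once, before anything is removed.
$fixed = $hits;
$n = count($fixed);
for ($i = 0; $i < $n; $i++) {
    if ($fixed[$i] === '') unset($fixed[$i]);
}

// Variant 2: array_filter() with an explicit callback, then reindex.
// (empty() would also treat the string "0" as empty.)
$filtered = array_values(array_filter($hits, function ($s) {
    return $s !== '';
}));

var_dump(count($broken), array_keys($fixed), $filtered);
```

In this stand-in, $broken still carries the trailing empty strings, which matches the pattern in the second dump above. Variant 2 also renumbers the keys from 0, which may be handy if the entries are grouped afterwards.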

Also, I'm using the PHP 5.3 CLI!
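
On the regex side, one possible direction (a sketch under assumptions, not a tested solution for the live page) is to only capture text that starts with a non-whitespace character and contains no angle brackets, so the gaps between tags never produce empty matches. The $html string is a shortened, hypothetical excerpt of two rows from the markup above, and the sketch assumes the header row ("Anstehende Konzerte", "Alle zeigen") has already been cut away, so that every concert yields exactly four text fields.

```php
<?php
// Shortened, hypothetical excerpt: two schedule rows, attributes trimmed.
$html = '<tr><td><table><tr>'
      . '<td width="85"><font size="1">16. Jan. 2010</font></td> '
      . '<td width="35"><font size="1">23:00</font></td>'
      . '</tr></table></td>'
      . '<td><font size="1"><a href="#">TERRABEATS FESTIVAL Festsaal Kreuzberg - Sonora Milagrosa, Shazalakazoo, Gankino Circus & Meniak</a></font></td> '
      . '<td><font size="1">Berlin</font></td></tr>'
      . '<tr><td><table><tr>'
      . '<td width="85"><font size="1">21. Mai. 2010</font></td> '
      . '<td width="35"><font size="1">19:00</font></td>'
      . '</tr></table></td>'
      . '<td><font size="1"><a href="#">SONORA MILAGROSA KONZERT @ KARNEVAL DER KULTUREN</a></font></td> '
      . '<td><font size="1">Berlin</font></td></tr>';

// Capture only text that begins with a non-whitespace character and has no
// angle brackets in it; tag-to-tag gaps therefore never match.
preg_match_all('~>\s*([^<>\s][^<>]*)<~', $html, $m);
$cells = array_map('trim', $m[1]);

// Assumption: each concert contributes date, time, title, location.
$concerts = array_chunk($cells, 4);
var_dump($concerts);
```

If the markup turns out to be less regular than the excerpt, PHP's DOM extension (DOMDocument with DOMXPath) is likely the sturdier choice than any regex.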
Last edited by easteregg (11.01.10 at 19:18)