Wget, some examples of what you can do with this tool

About wget

In this article we'll take a look at Wget. GNU Wget is a free tool that lets you download content from web servers simply and quickly. Its name comes from World Wide Web (w) and the English word get. In other words, the name means: get from the WWW.

Today there are many applications that download files very efficiently. Most of them are built on web or desktop interfaces and exist for every operating system. On Gnu/Linux, however (there is also a Windows version), we have wget, a powerful download manager. It is often considered the most powerful downloader available. It supports protocols such as http, https and ftp.

Downloading files with wget

Download a file

The simplest way to use this tool is to point it at the file we want:

wget http://sitioweb.com/programa.tar.gz

Download using different protocols

As a good download manager, it accepts more than one URL in a single command and fetches them one after another. We can also mix different protocols in the same command:

wget http://sitioweb.com/programa.tar.gz ftp://otrositio.com/descargas/videos/archivo-video.mpg

Download by extension

Another way to download several files that share the same extension is to combine recursive mode with an accept list:

wget -r -A.pdf http://sitioweb.com/

This command does not always work. Wget only finds files that are linked from pages it can read (a directory listing, for example), wildcards such as *.pdf are not expanded in HTTP URLs, and some servers may block access to wget.
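
When the server offers no listing to crawl but we already have a list of addresses, one alternative is to filter that list by extension ourselves. The sketch below is only an illustration: the example.com URLs and the file names all-urls.txt and pdf-urls.txt are made up, and nothing is downloaded unless DOWNLOAD=yes is set.

```shell
#!/bin/sh
# Hypothetical list of URLs; example.com stands in for a real site.
workdir=$(mktemp -d)
cat > "$workdir/all-urls.txt" <<'EOF'
http://example.com/docs/manual.pdf
http://example.com/img/logo.png
http://example.com/docs/report.pdf
EOF

# Keep only the URLs that end in .pdf (case-insensitive).
grep -i '\.pdf$' "$workdir/all-urls.txt" > "$workdir/pdf-urls.txt"
cat "$workdir/pdf-urls.txt"

# Set DOWNLOAD=yes to really fetch the files (needs network access).
# -i reads URLs from a file, -P chooses the directory to save into.
if [ "${DOWNLOAD:-no}" = yes ]; then
    wget -P "$workdir" -i "$workdir/pdf-urls.txt"
fi
```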

Download a list of files

If we want to download a set of files we have found, we only need to save their URLs in a file. We'll create a list called archivos.txt and pass the name of the list to the command. Put only one URL per line in archivos.txt.

The command to download the list we created and saved in archivos.txt is:

wget -i archivos.txt
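
Before handing a long list to wget it can help to tidy it up. The following is a minimal sketch, assuming placeholder example.com addresses and a temporary working directory. It drops blank lines and duplicates, and counts lines that do not look like http, https or ftp URLs. The download itself only runs when DOWNLOAD=yes is set.

```shell
#!/bin/sh
workdir=$(mktemp -d)
list="$workdir/archivos.txt"

# One URL per line; these addresses are placeholders.
cat > "$list" <<'EOF'
http://example.com/programa.tar.gz
ftp://example.com/videos/archivo-video.mpg

http://example.com/programa.tar.gz
EOF

# Drop blank lines and duplicates so each file is requested once.
clean="$workdir/archivos-clean.txt"
grep -v '^[[:space:]]*$' "$list" | sort -u > "$clean"

# Count lines that do not start with http://, https:// or ftp://.
bad=$(grep -Evc '^(https?|ftp)://' "$clean" || true)
echo "URLs to fetch: $(wc -l < "$clean" | tr -d ' '), suspicious lines: $bad"

# Set DOWNLOAD=yes to really fetch the files (needs network access).
if [ "${DOWNLOAD:-no}" = yes ]; then
    wget -i "$clean"
fi
```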

Resume a download

If the download is interrupted for any reason, we can continue from where it stopped by adding the -c option to the wget command:

wget -c -i archivos.txt
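
On an unreliable connection, wget -c can be wrapped in a small loop that calls it again until it succeeds. This is only a sketch: retry and RETRY_DELAY are names invented for the example, not part of wget, and the URL in the usage comment is a placeholder.

```shell
#!/bin/sh
# retry MAX CMD...: run CMD until it succeeds, at most MAX times,
# pausing RETRY_DELAY seconds (default 5) between attempts.
retry() {
    max=$1
    shift
    attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-5}"
    done
}

# Usage (placeholder URL): each new attempt resumes the partial file.
#   retry 5 wget -c http://example.com/programa.tar.gz
```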

Write a log of the download

If we want a log of the download, so we can check on anything that happened along the way, we add the -o option as shown here:

wget -o reporte.txt http://ejemplo.com/programa.tar.gz
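
Besides the log, wget reports how things went through its exit status. The helper below is a hypothetical convenience function (explain_wget_status is not part of wget) that maps the status codes listed in the wget manual to short messages. Older wget versions may not distinguish all of these codes.

```shell
#!/bin/sh
# Map a wget exit status to the meaning given in the wget manual.
explain_wget_status() {
    case "$1" in
        0) echo "no problems occurred" ;;
        1) echo "generic error" ;;
        2) echo "parse error (command line or .wgetrc)" ;;
        3) echo "file I/O error" ;;
        4) echo "network failure" ;;
        5) echo "SSL verification failure" ;;
        6) echo "authentication failure" ;;
        7) echo "protocol error" ;;
        8) echo "server issued an error response" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Usage (placeholder URL):
#   wget -o reporte.txt http://example.com/programa.tar.gz
#   echo "wget finished: $(explain_wget_status $?)"
explain_wget_status 8
```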

Limit download bandwidth

For very long downloads we can limit the bandwidth used. This keeps the download from taking up all the available bandwidth for as long as it runs:

wget -o /reporte.log --limit-rate=50k ftp://ftp.centos.org/download/centos5-dvd.iso
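
To get a feel for what a given limit means in practice, a quick calculation helps. The sketch below assumes a hypothetical 700 MiB file and treats the k suffix as 1024 bytes. The function name eta_seconds is made up for the example.

```shell
#!/bin/sh
# eta_seconds SIZE_BYTES RATE_BYTES_PER_SEC: transfer time, rounded up.
eta_seconds() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

size=$((700 * 1024 * 1024))   # hypothetical 700 MiB file
rate=$((50 * 1024))           # --limit-rate=50k, assuming k = 1024 bytes
secs=$(eta_seconds "$size" "$rate")

# Print the estimate as hours, minutes and seconds.
printf '%dh %dm %ds\n' $((secs / 3600)) $((secs % 3600 / 60)) $((secs % 60))
```

With these numbers the estimate comes to a little under four hours, which shows why a limit this low only makes sense when the download can run in the background.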

Download with a username and password

If we want to download from a site that requires a username and password, we just use these options:

wget --http-user=admin --http-password=12345 http://ejemplo.com/archivo.mp3

Download attempts

By default, this program makes 20 attempts to establish the connection and start the download. On very busy sites, even 20 attempts may not be enough. The -t option raises the number of attempts:

wget -t 50 http://ejemplo.com/pelicula.mpg

Download a website with wget

Wget is not limited to downloading files. We can also download a complete page. We just type something like:

wget www.ejemplo.com

Download a website and its extra elements

With the -p option we also download all the extra elements the page needs, such as style sheets, inline images, and so on.

If we add the -r option, wget downloads recursively, up to 5 levels deep from the site:

wget -r www.ejemplo.com -o reporte.log

Convert links to local ones

By default, the links within the site point to addresses on the full domain. If we download the site recursively and then browse it offline, we can use the --convert-links option to turn them into local links:

wget --convert-links -r http://www.sitio.com/

Get a full copy of the site

We can also get a complete copy of a site. The --mirror option is equivalent to using the options -r -l inf -N, which means recursion with unlimited depth and timestamping, so each downloaded file keeps its original timestamp.

wget --mirror http://www.sitio.com/

Convert extensions

If you download an entire site to view it offline, many of the downloaded files may not open because of extensions such as .cgi, .asp, or .php. In that case the --html-extension option saves those files with an .html extension (newer versions of wget call this option --adjust-extension).

wget --mirror --convert-links --html-extension http://www.ejemplo.com
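
The options from this section can be combined into a single offline-copy command. The sketch below is one possible combination, not a recommended recipe: the address is a placeholder, --adjust-extension is the newer name for --html-extension, and --no-parent and --wait=1 are extra assumptions added to stay inside the starting directory and to pause a second between requests. The command is only printed unless DOWNLOAD=yes is set.

```shell
#!/bin/sh
site="http://www.example.com/"   # placeholder address

# Build the argument list once so it can be inspected before running.
set -- --mirror --convert-links --page-requisites \
       --adjust-extension --no-parent --wait=1 "$site"

cmd="wget $*"
echo "$cmd"

# Set DOWNLOAD=yes to really mirror the site (needs network access).
if [ "${DOWNLOAD:-no}" = yes ]; then
    wget "$@"
fi
```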

These are only a few general examples of what you can do with Wget. Anyone who wants more can consult the online manual to see everything this free download manager has to offer.


7 comments

  1.   Ruben Cardenal said

    I stopped reading at "Download by extension". You can't download what you don't know exists. Unless the requested directory allows file listing and has no index (and both have to be true at the same time), what you describe can't be done. What a level.

    1.    Computer ba a sani ba said

      Hello Rubén, ignorance is a little daring.
      What you describe can be done with a simple Google query:
      filetype:pdf site:ubunlog.com
      In this example there are no PDFs on this site, but change the domain at the end to any website you like and you'll see how easy it is to list all the files of a given type on a site.
      Have a nice day.

      1.    Yin magana said

        But wget doesn't connect to Google to look for the PDFs under a URL. The web directory has to be open and there has to be an index page generated by mod_autoindex or something similar, as Rubén Cardenal says.

    2.    Jimmy olano said

      "This command does not always work, as some servers may block access to wget."
      This is the correction that was added to the article, and I don't agree with it (although technically it is possible to block certain user agents based on the HTTP request headers and return a 403 "forbidden" message). I'll explain why:

      All Apache web servers (and I'm talking about a considerable share of servers) allow globbing by default (excellent Wikipedia article, read: https://es.wikipedia.org/wiki/Glob_(inform%C3%A1tica) ).

      In practice this means, as Mr. Rubén says (and he is right), that IF THERE IS NO FILE CALLED "index.php" or "index.html" (or even just "index"), the server will quietly return a list of files and directories (as an HTML page, of course, with a link for each file). MOST SERVERS DISABLE THIS FEATURE THROUGH THE .htaccess FILE (strictly speaking, on Apache2) FOR SECURITY REASONS.

      This is where wget's versatility comes in (see its article, again on Wikipedia, the source you know best: https://es.wikipedia.org/wiki/GNU_Wget ): it parses or "scrapes" that listing and pulls out only the extensions we ask for.

      Now, if this doesn't work for one reason or another, we can try wget's other advanced features. I quote directly from the English documentation:

      You want to download all the GIFs from a directory on an HTTP server. You tried 'wget http://www.example.com/dir/*.gif', but that didn't work because HTTP retrieval does not support GLOBBING (the capitals are mine). In that case, use:

      wget -r -l1 --no-parent -A.gif http://www.example.com/dir/

      More verbose, but the effect is the same. '-r -l1' means to retrieve recursively (see Recursive Download), with maximum depth of 1. '--no-parent' means that references to the parent directory are ignored (see Directory-Based Limits), and '-A.gif' means to download only the GIF files. '-A "*.gif"' would have worked too.

      IF YOU RUN IT THIS WAY, wget will create a folder named after the web address we requested inside our working folder. It will create subdirectories as needed and place in them, for example, the .gif images we asked for.

      --------
      HOWEVER, if it still isn't possible to get certain kinds of files (*.jpg, for example), we have to use the "--page-requisites" parameter, which downloads all the internal elements of an HTML page (images, sounds, CSS, etc.) along with the HTML page itself ("--page-requisites" can be shortened to "-p"). That would be the equivalent of downloading something like an "mhtml" file https://tools.ietf.org/html/rfc2557

      I hope this information is useful to you.

      1.    Damian Amoedo said

        Thanks for the note. Salu2.

  2.   Bayan banki said

    I think you have a mistake: the first two lines have the same command.

  3.   Mike said

    Thank you very much, a really great tutorial!