PHP

How to parse a list of URLs using PHP?

I recently looked into how PHP’s parse_url() function works and found an example on Stack Overflow that caught my eye.

Here is the question that was asked:

I try to parse a list of url strings, after two hours of work I don’t reach any result, the list of url strings look like this:

$url_list = array(
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
    'http://www.bing.com',
    'http://www.phpromania.net/forum/viewtopic.php?f=24&t=7549',
    'https://prodgame10.alliances.commandandconquer.com/12/index.aspx',
    'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx',
);

# Output should be
Array
(
    [0] => .google.com
    [1] => .localhost
    [2] => .yahoo.com
    [3] => .bing.com
    [4] => .phpromania.net
    [5] => .commandandconquer.com
)
https://stackoverflow.com/questions/18881291/parsing-complex-urls/74725010#74725010

This question was answered in 2013. I thought about the person’s code and tried to understand it. Honestly, exploring his answer taught me many things about PHP.

Here is Alessandro Minoccheri’s answer:

function getDomain($url) 
{
    $domain = implode('.', array_slice(explode('.', parse_url($url, PHP_URL_HOST)), -2));
    return $domain;
}
# test cases:
foreach ($url_list as $url) {
    $result[] = getDomain($url);
}
https://stackoverflow.com/questions/18881291/parsing-complex-urls/74725010#74725010
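
Note that this function returns each domain without the leading dot shown in the question's expected output, and it does not remove duplicates. The following is a minimal sketch of one way to get closer to that format. The helper name `dottedUniqueDomains` is hypothetical and not part of the original answer, and the arrow function assumes PHP 7.4 or later. With the question's list, the `.ro` domain also appears in the result, although the question's expected output omits it.

```php
<?php
// getDomain() as in the answer above: keep the last two labels of the host.
function getDomain($url)
{
    return implode('.', array_slice(explode('.', parse_url($url, PHP_URL_HOST)), -2));
}

// Hypothetical helper (not from the original answer): prefix each domain
// with a dot, drop duplicates, and re-index the result from 0.
function dottedUniqueDomains(array $urls): array
{
    $domains = array_map(fn($url) => '.' . getDomain($url), $urls);
    return array_values(array_unique($domains));
}

$url_list = [
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
    'http://www.bing.com',
    'http://www.phpromania.net/forum/viewtopic.php?f=24&t=7549',
    'https://prodgame10.alliances.commandandconquer.com/12/index.aspx',
    'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx',
];

print_r(dottedUniqueDomains($url_list));
```

array_unique() keeps the first occurrence of each value, so the order of the input list is preserved.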

I wondered how I could simplify the answer and achieve the same result. My solution is not actually new; rather, I refactored his code using PHP’s array_map() with an arrow function (arrow functions require PHP 7.4 or later).

Here is my answer:

$domains = array_map(fn($url) => implode('.', array_slice(explode('.', parse_url($url, PHP_URL_HOST)), -2)), $url_list);
var_dump($domains);
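
One thing the one-liner does not handle is a string with no host at all, for which parse_url() returns null. Below is a hedged sketch of the same idea with that case guarded. The function name `lastTwoLabels` and the choice to return null for such input are my own assumptions, not something from the original answer.

```php
<?php
// Hypothetical helper: return the last two dot-separated labels of the
// URL's host, or null when the URL has no usable host component.
function lastTwoLabels(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // e.g. a relative path such as '/just/a/path'
    }
    return implode('.', array_slice(explode('.', $host), -2));
}

$url_list = [
    'http://google.com',
    'http://localhost:8080/test/project/',
    'http://mail.yahoo.com',
];

// Passing the function name as a string works as an array_map() callback.
$domains = array_map('lastTwoLabels', $url_list);
var_dump($domains);
```

A host with a single label, such as localhost, is returned unchanged because array_slice() with -2 simply starts at the beginning of a shorter array.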

If you are not sure what this line is doing, I recommend examining the code one step at a time. I do this in the following way:

$u = 'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx';

// Split the host into its dot-separated labels
$explode = explode('.', 'prodgame10.alliances.commandandconquer.ro');
//var_dump($explode);

// -2 means start at the second-to-last element of the array
$slice = array_slice($explode, -2);
//var_dump($slice);

// The same two steps, with the host taken from the URL by parse_url()
$v = array_slice(explode('.', parse_url($u, PHP_URL_HOST)), -2);
var_dump($v);
var_dump(implode('.', $v));
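
To make the walk-through concrete, here is a small sketch that records what each step yields for the same URL. The variable names are mine, and the `bbc.co.uk` URL at the end is only an illustrative input that does not appear in the original question. It shows a limit of the "last two labels" approach with multi-part suffixes.

```php
<?php
$u = 'https://prodgame10.alliances.commandandconquer.ro/12/index.aspx';

// Step 1: extract only the host component of the URL.
$host = parse_url($u, PHP_URL_HOST);
// 'prodgame10.alliances.commandandconquer.ro'

// Step 2: split the host on dots.
$labels = explode('.', $host);
// ['prodgame10', 'alliances', 'commandandconquer', 'ro']

// Step 3: keep the last two labels.
$lastTwo = array_slice($labels, -2);
// ['commandandconquer', 'ro']

// Step 4: join them back together with a dot.
$domain = implode('.', $lastTwo);
echo $domain, PHP_EOL; // commandandconquer.ro

// Illustrative caveat (input not from the original question): with a
// two-part suffix, the last two labels are the suffix itself.
$ukHost = parse_url('http://www.bbc.co.uk/news', PHP_URL_HOST);
$ukDomain = implode('.', array_slice(explode('.', $ukHost), -2));
echo $ukDomain, PHP_EOL; // co.uk
```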