Does anybody have a good reference covering the IE Object Model that I can use to scrape data from Web pages and put it into Access tables?
IE Object Model (from Access/VBA)
- This topic has 39 replies, 3 voices, and was last updated 21 years, 7 months ago.
WSjscher2000
AskWoody Lounger
September 16, 2003 at 3:13 am #715288
Pat, when you use the document object model, your document has several collections that could be useful here. Assume you have created an object reference to the HTML document:
Dim myHTMLDoc As MSHTML.HTMLDocument
Set myHTMLDoc = Something that returns an HTML document…not important for current purposes
Of those collections, the one that seems most relevant (and most specific, which is important to avoid mistakenly targeting some garbage code) is the links collection:
- myHTMLDoc.links.length gives you the count of all links in the entire page; remember that the collection is numbered starting from zero, so the index of the last item in the collection is length-1.
- myHTMLDoc.links.item(0).innerHTML gives you the exact HTML code that is used to generate the visual display associated with the first link; it could be plain text, or text with HTML tags (such as an IMG tag), or just an image tag.
- myHTMLDoc.links.item(0).innerText gives you the visible text, if any, that is associated with the first link; HTML tags are stripped out.
- myHTMLDoc.links.item(0).href gives you the complete path for the first link.
You could loop through the collection looking for a match to the expected “innerText” or use your imagination.
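The loop over the links collection described above might look roughly like the sketch below. It assumes a reference to the Microsoft HTML Object Library is set; FindLinkByText is a hypothetical helper name invented for this illustration, not part of MSHTML, and the case-insensitive match is just one possible choice.

```vba
' Sketch: return the href of the first link whose visible text matches
' strTarget, or an empty string if no link matches.
' FindLinkByText is a hypothetical helper name, not an MSHTML member.
Public Function FindLinkByText(ByVal doc As MSHTML.HTMLDocument, _
                               ByVal strTarget As String) As String
    Dim i As Long
    Dim lnk As Object   ' late-bound so any element type in the collection works

    FindLinkByText = vbNullString
    ' The collection is zero-based, so the last index is length - 1
    For i = 0 To doc.links.Length - 1
        Set lnk = doc.links.Item(i)
        ' Appending "" guards against a Null innerText
        If StrComp(Trim$(lnk.innerText & ""), strTarget, vbTextCompare) = 0 Then
            FindLinkByText = lnk.href
            Exit Function
        End If
    Next i
End Function
```

A call such as FindLinkByText(myHTMLDoc, "Next page") would then return the complete path of the matching link, where "Next page" stands in for whatever link text you expect.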
-
WSpatt
AskWoody Lounger
September 16, 2003 at 3:45 am #715292
Thanks Jefferson. I’m sorry to be such a pest, but I really need to find out about this.
Now you are talking about a Document Object Model rather than the IE Object Model, for which you provided some code. That code works very well, thank you.
Can the Document Object Model read in tables like the IE Object Model can?
Where can I get some doco to read up on the Document Object Model?
-
WSjscher2000
AskWoody Lounger
September 16, 2003 at 5:47 pm #715588
The MSHTML library contains Microsoft’s encapsulation of the document object model (DOM). It is largely compliant with the W3C model, but has proprietary extensions such as the .all collection that you will see used frequently in code written for Internet Explorer version 4. In this sense, it is and is not really the Internet Explorer object model.
I hope that sort of clarifies the terminology.
I guess strictly speaking the Internet Explorer object model is the one that contains the InternetExplorer object. I don’t remember the name that appears in the Tools>References dialog, but it could be similar to Microsoft Internet Controls.
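On the question of reading tables through the DOM, a minimal sketch is shown below. It assumes a reference to the Microsoft HTML Object Library and a document that has already finished loading; DumpTables is a hypothetical name chosen for this illustration. It uses the W3C-style getElementsByTagName call, with the proprietary .all.tags form noted in a comment for comparison.

```vba
' Sketch: print every cell of every TABLE in a document to the Immediate window.
' DumpTables is a hypothetical name, not part of MSHTML.
Public Sub DumpTables(ByVal doc As MSHTML.HTMLDocument)
    Dim colTables As Object, tbl As Object
    Dim t As Long, r As Long, c As Long

    ' W3C-style call; doc.all.tags("TABLE") is the IE-proprietary equivalent
    Set colTables = doc.getElementsByTagName("TABLE")

    For t = 0 To colTables.Length - 1
        Set tbl = colTables.Item(t)
        ' Each table exposes a zero-based rows collection,
        ' and each row a zero-based cells collection
        For r = 0 To tbl.Rows.Length - 1
            For c = 0 To tbl.Rows.Item(r).Cells.Length - 1
                Debug.Print t, r, c, tbl.Rows.Item(r).Cells.Item(c).innerText
            Next c
        Next r
    Next t
End Sub
```

Printing the table, row, and cell indexes alongside the text is only a way to discover which table on a page holds the data you want; a page that nests tables for layout will report the inner tables as separate entries.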
-
WSjscher2000
AskWoody Lounger
October 5, 2003 at 5:44 am #724448
Here’s some sample code:
Option Explicit

'Declare Sleep API (PtrSafe is required in 64-bit Office, i.e. VBA7 and later)
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
#End If

Sub RetrieveBODYText()
    ' Jefferson F. Scher 2003-10-04
    ' Uses IE DOM to grab BODY text from web page
    ' SET REFERENCES TO Microsoft HTML Object Library AND Microsoft Internet Controls

    'Create browser object references
    Dim ieSrc As New InternetExplorer

    'Load page
    With ieSrc
        .Visible = True 'show window and load page
        .navigate "http://www.microsoft.com/homepage/ms.htm"
        While Not .readyState = READYSTATE_COMPLETE
            Sleep 500 'wait 1/2 sec before trying again
        Wend
    End With

    'Create document object model references
    Dim ieDocSrc As MSHTML.HTMLDocument
    Set ieDocSrc = ieSrc.Document

    'Fetch the BODY Text
    Dim strBODYtext As String, strBODYhtml As String, colBODYs As Variant
    Set colBODYs = ieDocSrc.all.tags("BODY")
    If colBODYs.Length = 0 Then
        MsgBox "Page has no body (maybe it's a frameset?)"
    Else ' get first body
        strBODYtext = colBODYs(0).innerText
        strBODYhtml = colBODYs(0).innerHTML
        Stop 'inspect vars in the Locals and/or Immediate window
    End If

    'Clean up objects
    Set ieDocSrc = Nothing
    ieSrc.Quit
    Set ieSrc = Nothing
End Sub
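To get a scraped value into an Access table, which was the original goal of the thread, one option is a DAO recordset, sketched below. This is only an illustration under stated assumptions: the table tblScrape and its fields SourceURL (Text) and BodyText (Memo) are hypothetical names, SaveBodyText is an invented helper, and the code needs a reference to a DAO object library, which may have to be added by hand in Access 2000 and 2002.

```vba
' Sketch: append one scraped value to an Access table using DAO.
' tblScrape, SourceURL and BodyText are hypothetical names for illustration;
' create a matching table first or change the names to suit.
Public Sub SaveBodyText(ByVal strURL As String, ByVal strBODYtext As String)
    Dim db As DAO.Database
    Dim rs As DAO.Recordset

    Set db = CurrentDb
    Set rs = db.OpenRecordset("tblScrape", dbOpenDynaset)
    With rs
        .AddNew                     ' start a new record
        !SourceURL = strURL
        !BodyText = strBODYtext
        .Update                     ' commit the new record
        .Close
    End With

    Set rs = Nothing
    Set db = Nothing
End Sub
```

In RetrieveBODYText, a call such as SaveBodyText "http://www.microsoft.com/homepage/ms.htm", strBODYtext could take the place of the Stop statement once the variables have been checked.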
-
WSpatt
AskWoody Lounger
September 14, 2003 at 8:31 pm #714670
I don’t know if W3C is the ticket, but I’ll certainly have a look at this site.
What I want is doco on how to scrape details from a site (this could include tables and other text). What Jefferson provided was an example of how to get the data from a fixed-column table on a web page, which proved invaluable.
I have modified this somewhat to get what I want, but I would like to be able to access other information on this page. So any other doco on this topic would be extremely valuable.