How to fill and submit a web form on a third-party website using a Windows application (C#)?
Original question: http://stackoverflow.com/questions/17529276/
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Asked by Somi
I am working on a project in which I have to build a Windows application that takes a URL from the user in a text box. When the user presses the Proceed button, the application should open that URL in a WebBrowser control, fill in the form on that page (which contains user ID and password text boxes), and submit it via the login button on that page. The application should then show the next page to the user in the same WebBrowser control.
I can open the URL in the application's WebBrowser control from my C# code, but I can't figure out how to find the user ID and password text boxes on the page currently loaded in the control, how to fill them in, or how to find the login button and click it from my C# code.
Answered by Arran
Answered by Nemo
You don't have to simulate filling in the username/password fields or clicking the login button. You need to simulate the browser rather than the user.
Read the login page HTML and parse it to find the ids of the username and password fields. The username field can be found by looking for <input> tags whose name is set to "username", "user", "login", etc. The password field will usually be an <input> tag with type="password". JavaScript-based popup panels for login would involve parsing the JS.
Then follow the example code shown in the Stack Overflow question "How do you programmatically fill in a form and 'POST' a web page?"
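The field lookup described in this answer could be sketched roughly as below. This is only an illustration under stated assumptions: `LoginFormScanner` and its methods are hypothetical names, the regular expressions are a crude stand-in for a real HTML parser, and they only handle double-quoted attribute values.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original answer): scans raw HTML for
// the names of the login fields. Assumes double-quoted attribute values.
public static class LoginFormScanner
{
    // Matches each <input ...> tag in the page.
    private static readonly Regex InputTag =
        new Regex(@"<input\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the value of a double-quoted attribute, or null if it is absent.
    private static string GetAttribute(string tag, string attribute)
    {
        Match m = Regex.Match(
            tag,
            @"(?<![\w-])" + Regex.Escape(attribute) + @"\s*=\s*""([^""]*)""",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    // Returns the name of the first <input type="password">, or null.
    public static string FindPasswordFieldName(string html)
    {
        foreach (Match m in InputTag.Matches(html))
        {
            string type = GetAttribute(m.Value, "type");
            if (string.Equals(type, "password", StringComparison.OrdinalIgnoreCase))
                return GetAttribute(m.Value, "name");
        }
        return null;
    }

    // Returns the name of the first text-like input whose name contains
    // "user" or "login" (the hints mentioned in the answer), or null.
    public static string FindUserFieldName(string html)
    {
        string[] hints = { "user", "login" };
        foreach (Match m in InputTag.Matches(html))
        {
            string name = GetAttribute(m.Value, "name");
            string type = GetAttribute(m.Value, "type");
            bool isTextLike = type == null
                || type.Equals("text", StringComparison.OrdinalIgnoreCase)
                || type.Equals("email", StringComparison.OrdinalIgnoreCase);
            if (name == null || !isTextLike)
                continue;
            foreach (string hint in hints)
                if (name.IndexOf(hint, StringComparison.OrdinalIgnoreCase) >= 0)
                    return name;
        }
        return null;
    }
}
```

For anything beyond a quick experiment, a proper HTML parser is likely a safer choice than regular expressions, since real pages often use single quotes, unquoted attributes, or script-generated forms.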
Answered by Patrick Geyer
For this you will have to look at the page source of the third-party site and find the ids of the username text box, the password text box, and the submit button. (If you provide a link, I can check it for you.) Then use this code:
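// Add references to Microsoft.mshtml and SHDocVw (Microsoft Internet Controls) in Solution Explorer.
using System;
using System.Windows.Forms;
using mshtml;

// The following members belong inside the form class that hosts webBrowser1.
private SHDocVw.WebBrowser_V1 Web_V1;

private void Form1_Load(object sender, EventArgs e)
{
    // Get the underlying ActiveX browser so the mshtml document can be accessed.
    Web_V1 = (SHDocVw.WebBrowser_V1)webBrowser1.ActiveXInstance;
}

private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Only act once the login page has finished loading.
    if (webBrowser1.ReadyState != WebBrowserReadyState.Complete)
        return;
    if (webBrowser1.Url.ToString() != "YourLoginSite.Com")
        return;

    try
    {
        // One document reference is enough for all three lookups.
        HTMLDocument doc = (HTMLDocument)Web_V1.Document;

        HTMLInputElement passBox = (HTMLInputElement)doc.all.item("PassIDThatyoufoundinsource", 0);
        passBox.value = "YourPassword";

        HTMLInputElement logBox = (HTMLInputElement)doc.all.item("loginidfrompagesource", 0);
        logBox.value = "yourlogin";

        HTMLInputElement submit = (HTMLInputElement)doc.all.item("SubmitButtonIDFromPageSource", 0);
        submit.click();
    }
    catch (Exception ex)
    {
        // Log the failure (for example, an id that was not found) instead of silently ignoring it.
        System.Diagnostics.Debug.WriteLine(ex);
    }
}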
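As an alternative to the mshtml interop shown in this answer, the same steps can probably be done with the managed `HtmlDocument` API that the Windows Forms `WebBrowser` control exposes. The sketch below is not part of the original answer: `LoginForm` is a hypothetical class, and the element ids and credentials are the same placeholders used above, which you would replace with values from the page source.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form (an assumption, not the answer's code) that fills in and
// submits a login page using only the managed WebBrowser API.
public class LoginForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };
    private bool submitted; // prevents re-submitting when the next page loads

    public LoginForm(string loginUrl)
    {
        Controls.Add(browser);
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate(loginUrl);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted also fires for frames; wait for the top-level page.
        if (submitted || browser.Document == null || e.Url != browser.Url)
            return;

        HtmlDocument doc = browser.Document;
        // Placeholder ids: replace them with the ids found in the page source.
        HtmlElement user = doc.GetElementById("loginidfrompagesource");
        HtmlElement password = doc.GetElementById("PassIDThatyoufoundinsource");
        HtmlElement submit = doc.GetElementById("SubmitButtonIDFromPageSource");
        if (user == null || password == null || submit == null)
            return; // not the login page, or the ids are wrong

        user.SetAttribute("value", "yourlogin");
        password.SetAttribute("value", "YourPassword");
        submitted = true;
        submit.InvokeMember("click"); // the control then shows the next page
    }
}
```

A form like this would typically be started from a `[STAThread]` `Main` method with `Application.Run(new LoginForm(url))`, where `url` is the address the user typed into the text box.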
Answered by Mark Micallef
The important thing here is that you're simulating a browser POST event. Don't worry about text boxes and other visual form elements; your goal is to generate an HTTP POST request with the appropriate key-value pairs.
Your first step is to look through the HTML of the page you're pretending to be and figure out the names of the user ID and password form elements. Say, for example, that they're called "txtUsername" and "txtPassword" respectively; then the post arguments that the browser (or user agent) sends in its POST request will be something like:
txtUsername=fflintstone&txtPassword=ilikerocks
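A request body of this shape can be assembled from name/value pairs as in the sketch below. `FormBody.Encode` is a hypothetical helper, not something from the answer; it relies on `System.Net.WebUtility.UrlEncode` so that characters such as spaces or `&` inside a value do not break the key-value format.

```csharp
using System;
using System.Collections.Generic;
using System.Net;

// Hypothetical helper (an assumption for illustration).
public static class FormBody
{
    // Joins the fields as name=value&name=value, URL-encoding both sides.
    public static string Encode(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (KeyValuePair<string, string> field in fields)
        {
            parts.Add(WebUtility.UrlEncode(field.Key) + "=" + WebUtility.UrlEncode(field.Value));
        }
        return string.Join("&", parts);
    }
}
```

Called with the pairs `txtUsername`/`fflintstone` and `txtPassword`/`ilikerocks`, it should produce the string shown above.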
As a background to this, you might like to do a little research on how HTTP works. But I'll leave that to you.
The other important thing is to figure out which URL the login request is posted to. Normally this is whatever appears in the address bar of the browser when you log in, but it may be something else. You'll need to check the action attribute of the form element to see where it goes.
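Because the action attribute is often a relative path, it has to be resolved against the address of the page that contains the form. A minimal sketch, assuming a hypothetical `FormAction.Resolve` helper and a made-up relative path:

```csharp
using System;

// Hypothetical helper (an assumption for illustration).
public static class FormAction
{
    // Resolves a form's action attribute against the URL of the page that contains it.
    public static Uri Resolve(Uri pageUrl, string action)
    {
        // A missing or empty action means the form posts back to the page itself.
        if (string.IsNullOrWhiteSpace(action))
            return pageUrl;

        // Handles both relative paths and absolute URLs.
        return new Uri(pageUrl, action);
    }
}
```

For example, resolving the made-up action `auth/check.aspx` against `http://www.hello.org/login.aspx` gives `http://www.hello.org/auth/check.aspx`, while an absolute action is returned unchanged.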
It may be useful to download a copy of Fiddler2. Yes, weird name, but it's a great web debugging tool that basically acts as a proxy and captures everything going between the browser and the remote host. Once you figure out how to use it, you can then pull apart each request-response to see what's happening. It'll give you the URL being called, the type of the request (usually GET or POST), the request arguments, and the full text of the response.
Now, you want to build your app. You need to build logic that makes the correct HTTP requests, passes in the form arguments, and gets back the results. Luckily, the System.Net.HttpWebRequest class will help you do just that.
Let's say the login page is at www.hello.org/login.aspx and it expects you to POST the login arguments. So your code might look something like this (obviously, this is very simplified):
Imports System.IO
Imports System.Net
Imports System.Text

' Build the request for the login page.
Dim uri As String = "http://www.hello.org/login.aspx"
Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uri), HttpWebRequest)
request.Timeout = 10000 ' 10 seconds
request.UserAgent = "FlintstoneFetcher/1.0" ' or whatever
request.Accept = "text/*"
request.Headers.Add("Accept-Language", "en")
request.Method = "POST"

' Encode the form arguments and write them to the request body.
Dim data As Byte() = Encoding.ASCII.GetBytes("txtUsername=fflintstone&txtPassword=ilikerocks")
request.ContentType = "application/x-www-form-urlencoded"
request.ContentLength = data.Length
Dim postStream As Stream = request.GetRequestStream()
postStream.Write(data, 0, data.Length)
postStream.Close()

' Send the request and read the whole response as text.
Dim webResponse As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
Dim streamReader As StreamReader = New StreamReader(webResponse.GetResponseStream(), Encoding.GetEncoding(1252))
Dim response As String = streamReader.ReadToEnd()
streamReader.Close()
webResponse.Close()
The response string now contains the full response text from the remote host, and that host should consider you logged in. You may need to do a little extra work if the remote host is trying to set cookies (you'll need to return those cookies). Alternatively, if it expects you to pass integrated authentication on successive pages, you'll need to add credentials to your successive requests, something like:
request.Credentials = New NetworkCredential(theUsername, thePassword)
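For the cookie case mentioned above, one common approach is to attach the same `CookieContainer` to every `HttpWebRequest`, so that cookies set by the login response are sent back on later requests. The sketch below is written in C# to match the question; `WebSession` and `CreateRequest` are hypothetical names, and the code only builds requests without sending them.

```csharp
using System;
using System.Net;

// Hypothetical session wrapper (an assumption for illustration).
public class WebSession
{
    // One container shared by every request created through this session.
    private readonly CookieContainer cookies = new CookieContainer();

    public CookieContainer Cookies
    {
        get { return cookies; }
    }

    // Creates a request that stores and returns cookies via the shared container.
    public HttpWebRequest CreateRequest(string url, string method)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = method;
        request.CookieContainer = cookies;
        return request;
    }
}
```

The idea is to create the login POST and every follow-up GET through the same `WebSession` instance, so the remote host sees one continuous session. Note that newer .NET versions mark `WebRequest.Create` as obsolete in favor of `HttpClient`; it is used here only because the answer is built on `HttpWebRequest`.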
That should be enough information to get cracking. I would recommend that you modularise your logic for working with HTTP into a class of its own. I've implemented a complex solution that logs into a certain website, navigates to a pre-determined page, parses the html and looks for a daily file to be downloaded in the "invox" and if it exists then downloads it. I set this up as a batch process which runs each morning, saving someone having to do this manually. Hopefully, my experience will benefit you!

